[net] gre: Fix MTU sizing check for gretap tunnels

Message ID 20130711201152.8775.40579.stgit@ahduyck-hc1.jf.intel.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Duyck, Alexander H July 11, 2013, 8:12 p.m. UTC
This change fixes an MTU sizing issue seen with gretap tunnels when non-gso
packets are sent from the interface.

In my case I was able to reproduce the issue by simply sending a ping of
1421 bytes with the gretap interface created on a device with a standard
1500 mtu.

This fix is based on the fact that the tunnel mtu is already adjusted by
dev->hard_header_len so it would make sense that any packets being compared
against that mtu should also be adjusted by hard_header_len and the tunnel
header instead of just the tunnel header.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 net/ipv4/ip_tunnel.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet July 11, 2013, 9:52 p.m. UTC | #1
On Thu, 2013-07-11 at 13:12 -0700, Alexander Duyck wrote:
> This change fixes an MTU sizing issue seen with gretap tunnels when non-gso
> packets are sent from the interface.
> 
> In my case I was able to reproduce the issue by simply sending a ping of
> 1421 bytes with the gretap interface created on a device with a standard
> 1500 mtu.
> 
> This fix is based on the fact that the tunnel mtu is already adjusted by
> dev->hard_header_len so it would make sense that any packets being compared
> against that mtu should also be adjusted by hard_header_len and the tunnel
> header instead of just the tunnel header.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
> 
>  net/ipv4/ip_tunnel.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index 945734b..ca1cb2d 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -476,7 +476,7 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb,
>  			    struct rtable *rt, __be16 df)
>  {
>  	struct ip_tunnel *tunnel = netdev_priv(dev);
> -	int pkt_size = skb->len - tunnel->hlen;
> +	int pkt_size = skb->len - tunnel->hlen - dev->hard_header_len;
>  	int mtu;
>  
>  

Reported-by: Cong Wang <amwang@redhat.com>

Acked-by: Eric Dumazet <edumazet@google.com>


Pravin B Shelar July 11, 2013, 10:24 p.m. UTC | #2
On Thu, Jul 11, 2013 at 1:12 PM, Alexander Duyck
<alexander.h.duyck@intel.com> wrote:
> This change fixes an MTU sizing issue seen with gretap tunnels when non-gso
> packets are sent from the interface.
>
> In my case I was able to reproduce the issue by simply sending a ping of
> 1421 bytes with the gretap interface created on a device with a standard
> 1500 mtu.
>
> This fix is based on the fact that the tunnel mtu is already adjusted by
> dev->hard_header_len so it would make sense that any packets being compared
> against that mtu should also be adjusted by hard_header_len and the tunnel
> header instead of just the tunnel header.
>
We can simplify the code by not applying the dev->hard_header_len
adjustment to the tunnel MTU.

The right thing would be to adjust the tunnel MTU according to the header
length of rt->dst.dev, so that we get the MTU for the outgoing path.

> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
>
>  net/ipv4/ip_tunnel.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index 945734b..ca1cb2d 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -476,7 +476,7 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb,
>                             struct rtable *rt, __be16 df)
>  {
>         struct ip_tunnel *tunnel = netdev_priv(dev);
> -       int pkt_size = skb->len - tunnel->hlen;
> +       int pkt_size = skb->len - tunnel->hlen - dev->hard_header_len;
>         int mtu;
>
>         if (df)
>
Eric Dumazet July 11, 2013, 10:45 p.m. UTC | #3
On Thu, 2013-07-11 at 15:24 -0700, Pravin Shelar wrote:
> On Thu, Jul 11, 2013 at 1:12 PM, Alexander Duyck
> <alexander.h.duyck@intel.com> wrote:
> > This change fixes an MTU sizing issue seen with gretap tunnels when non-gso
> > packets are sent from the interface.
> >
> > In my case I was able to reproduce the issue by simply sending a ping of
> > 1421 bytes with the gretap interface created on a device with a standard
> > 1500 mtu.
> >
> > This fix is based on the fact that the tunnel mtu is already adjusted by
> > dev->hard_header_len so it would make sense that any packets being compared
> > against that mtu should also be adjusted by hard_header_len and the tunnel
> > header instead of just the tunnel header.
> >
> we can simplify code by not doing dev->hard_header_len adjustment to tunnel-mtu.
> 
> And right thing would be adjusting tunnel-mtu according to rt->dst.dev
> header-len so that we get mtu for out going path.

What's the MTU value we want to put in the ICMP message?


Pravin B Shelar July 11, 2013, 11:19 p.m. UTC | #4
On Thu, Jul 11, 2013 at 3:45 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2013-07-11 at 15:24 -0700, Pravin Shelar wrote:
>> On Thu, Jul 11, 2013 at 1:12 PM, Alexander Duyck
>> <alexander.h.duyck@intel.com> wrote:
>> > This change fixes an MTU sizing issue seen with gretap tunnels when non-gso
>> > packets are sent from the interface.
>> >
>> > In my case I was able to reproduce the issue by simply sending a ping of
>> > 1421 bytes with the gretap interface created on a device with a standard
>> > 1500 mtu.
>> >
>> > This fix is based on the fact that the tunnel mtu is already adjusted by
>> > dev->hard_header_len so it would make sense that any packets being compared
>> > against that mtu should also be adjusted by hard_header_len and the tunnel
>> > header instead of just the tunnel header.
>> >
>> we can simplify code by not doing dev->hard_header_len adjustment to tunnel-mtu.
>>
>> And right thing would be adjusting tunnel-mtu according to rt->dst.dev
>> header-len so that we get mtu for out going path.
>
> What's the mtu value we want to put in the ICMP message ?
>
>
I think it should be the maximum payload that the tunnel device can take
for that route. Something like (route_mtu - (tunnel_header_len + iph_len +
route_dev->header_len)).

gre has been using dev->hard_header_len rather than
rt_dev->hard_header_len for a long time, which does not look right.
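
As a rough illustration of the formula proposed in this message, the
calculation could be written as one small helper. This is a userspace sketch,
not kernel code: the function name tunnel_payload_mtu and the example values
in the comment (1500-byte route MTU, 4-byte base GRE header, 20-byte IPv4
header without options, 14-byte Ethernet header on the route device) are
assumptions made for illustration and do not come from the thread.

```c
#include <assert.h>

/*
 * Hypothetical helper (the name is an assumption): the largest payload
 * the tunnel device could accept for a given route, following
 *
 *   route_mtu - (tunnel_header_len + iph_len + route_dev_header_len)
 *
 * All arguments are byte counts.
 */
static int tunnel_payload_mtu(int route_mtu, int tunnel_header_len,
                              int iph_len, int route_dev_header_len)
{
    int overhead = tunnel_header_len + iph_len + route_dev_header_len;

    /* The headers must fit within the route MTU for the result to make sense. */
    assert(route_mtu >= overhead);
    return route_mtu - overhead;
}

/*
 * Example with the assumed values described above:
 *   tunnel_payload_mtu(1500, 4, 20, 14)
 */
```

Keeping the three header lengths as separate parameters makes it explicit
which device each length is taken from, which is the point under discussion.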
David Miller July 11, 2013, 11:30 p.m. UTC | #5
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 11 Jul 2013 14:52:27 -0700

> On Thu, 2013-07-11 at 13:12 -0700, Alexander Duyck wrote:
>> This change fixes an MTU sizing issue seen with gretap tunnels when non-gso
>> packets are sent from the interface.
>> 
>> In my case I was able to reproduce the issue by simply sending a ping of
>> 1421 bytes with the gretap interface created on a device with a standard
>> 1500 mtu.
>> 
>> This fix is based on the fact that the tunnel mtu is already adjusted by
>> dev->hard_header_len so it would make sense that any packets being compared
>> against that mtu should also be adjusted by hard_header_len and the tunnel
>> header instead of just the tunnel header.
>> 
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
...
> Reported-by: Cong Wang <amwang@redhat.com>
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

Applied, thanks everyone.
David Miller July 11, 2013, 11:30 p.m. UTC | #6
From: Pravin Shelar <pshelar@nicira.com>
Date: Thu, 11 Jul 2013 16:19:17 -0700

> gre is been using dev->hard_header_len rather than
> rt_dev->hard_header_len for long time which does not look right.

I noticed this as well.

I would suggest implementing these calculations in small discrete
helper functions, with big comments.  Right now the code is hard
to understand even by experts.
Jesse Gross July 11, 2013, 11:36 p.m. UTC | #7
On Thu, Jul 11, 2013 at 4:19 PM, Pravin Shelar <pshelar@nicira.com> wrote:
> On Thu, Jul 11, 2013 at 3:45 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Thu, 2013-07-11 at 15:24 -0700, Pravin Shelar wrote:
>>> On Thu, Jul 11, 2013 at 1:12 PM, Alexander Duyck
>>> <alexander.h.duyck@intel.com> wrote:
>>> > This change fixes an MTU sizing issue seen with gretap tunnels when non-gso
>>> > packets are sent from the interface.
>>> >
>>> > In my case I was able to reproduce the issue by simply sending a ping of
>>> > 1421 bytes with the gretap interface created on a device with a standard
>>> > 1500 mtu.
>>> >
>>> > This fix is based on the fact that the tunnel mtu is already adjusted by
>>> > dev->hard_header_len so it would make sense that any packets being compared
>>> > against that mtu should also be adjusted by hard_header_len and the tunnel
>>> > header instead of just the tunnel header.
>>> >
>>> we can simplify code by not doing dev->hard_header_len adjustment to tunnel-mtu.
>>>
>>> And right thing would be adjusting tunnel-mtu according to rt->dst.dev
>>> header-len so that we get mtu for out going path.
>>
>> What's the mtu value we want to put in the ICMP message ?
>>
>>
> I think it should be max payload that tunnel-device can take for that
> route. Something like (route_mtu - (tunnel_header_len + iph_len +
> route_dev->header_len))
>
> gre is been using dev->hard_header_len rather than
> rt_dev->hard_header_len for long time which does not look right.

I think that it is trying to use the tunnel device's header length to
get the payload length. The MTU of the output device should already
take into account its header length. I agree that the code is hard to
read right now though.
Alexander H Duyck July 12, 2013, 12:13 a.m. UTC | #8
On 07/11/2013 04:36 PM, Jesse Gross wrote:
> On Thu, Jul 11, 2013 at 4:19 PM, Pravin Shelar <pshelar@nicira.com> wrote:
>> On Thu, Jul 11, 2013 at 3:45 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> On Thu, 2013-07-11 at 15:24 -0700, Pravin Shelar wrote:
>>>> On Thu, Jul 11, 2013 at 1:12 PM, Alexander Duyck
>>>> <alexander.h.duyck@intel.com> wrote:
>>>>> This change fixes an MTU sizing issue seen with gretap tunnels when non-gso
>>>>> packets are sent from the interface.
>>>>>
>>>>> In my case I was able to reproduce the issue by simply sending a ping of
>>>>> 1421 bytes with the gretap interface created on a device with a standard
>>>>> 1500 mtu.
>>>>>
>>>>> This fix is based on the fact that the tunnel mtu is already adjusted by
>>>>> dev->hard_header_len so it would make sense that any packets being compared
>>>>> against that mtu should also be adjusted by hard_header_len and the tunnel
>>>>> header instead of just the tunnel header.
>>>>>
>>>> we can simplify code by not doing dev->hard_header_len adjustment to tunnel-mtu.
>>>>
>>>> And right thing would be adjusting tunnel-mtu according to rt->dst.dev
>>>> header-len so that we get mtu for out going path.
>>> What's the mtu value we want to put in the ICMP message ?
>>>
>>>
>> I think it should be max payload that tunnel-device can take for that
>> route. Something like (route_mtu - (tunnel_header_len + iph_len +
>> route_dev->header_len))
>>
>> gre is been using dev->hard_header_len rather than
>> rt_dev->hard_header_len for long time which does not look right.
> I think that it is trying to use the tunnel device's header length to
> get the payload length. The MTU of the output device should already
> take into account its header length. I agree that the code is hard to
> read right now though

That is what I assume as well.  The only issue was the fact that after
the recent changes the calculation was off as it was including the
Ethernet header size when calculating the network packet size in the
case of gretap.  Prior to recent changes this code just pulled the
network packet size out of the network header when comparing it to the MTU.

Thanks,

Alex
Cong Wang July 12, 2013, 1:11 a.m. UTC | #9
On Thu, 11 Jul 2013 at 20:12 GMT, Alexander Duyck <alexander.h.duyck@intel.com> wrote:
> This change fixes an MTU sizing issue seen with gretap tunnels when non-gso
> packets are sent from the interface.
>
> In my case I was able to reproduce the issue by simply sending a ping of
> 1421 bytes with the gretap interface created on a device with a standard
> 1500 mtu.
>
> This fix is based on the fact that the tunnel mtu is already adjusted by
> dev->hard_header_len so it would make sense that any packets being compared
> against that mtu should also be adjusted by hard_header_len and the tunnel
> header instead of just the tunnel header.
>


This patch indeed fixes the performance problem I reported, nice work!

Thanks for the patch, Alexander!


Patch

diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 945734b..ca1cb2d 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -476,7 +476,7 @@  static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb,
 			    struct rtable *rt, __be16 df)
 {
 	struct ip_tunnel *tunnel = netdev_priv(dev);
-	int pkt_size = skb->len - tunnel->hlen;
+	int pkt_size = skb->len - tunnel->hlen - dev->hard_header_len;
 	int mtu;
 
 	if (df)
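
The effect of the one-line change can be checked with a small userspace
model of the size comparison. This is only a sketch under stated assumptions,
not the kernel code: it assumes a gretap device over Ethernet with a 14-byte
inner Ethernet header (standing in for dev->hard_header_len), a 4-byte base
GRE header (standing in for tunnel->hlen), a 20-byte IPv4 header, an 8-byte
ICMP echo header, that skb->len at this point covers the GRE header plus the
inner Ethernet frame, and that a packet is rejected when pkt_size exceeds the
tunnel MTU. All function and constant names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed sizes in bytes; none of these constants come from the patch. */
enum {
    ETH_HDR_LEN  = 14,   /* inner Ethernet header (dev->hard_header_len) */
    IPV4_HDR_LEN = 20,   /* IPv4 header without options */
    GRE_HDR_LEN  = 4,    /* base GRE header (tunnel->hlen) */
    ICMP_HDR_LEN = 8,    /* ICMP echo header */
    LOWER_MTU    = 1500  /* MTU of the underlying device */
};

/* Tunnel MTU: lower MTU minus Ethernet, outer IPv4 and GRE headers. */
static int tunnel_mtu(void)
{
    return LOWER_MTU - ETH_HDR_LEN - IPV4_HDR_LEN - GRE_HDR_LEN;
}

/* Modelled skb->len for a ping carrying ping_payload data bytes. */
static int skb_len_for_ping(int ping_payload)
{
    assert(ping_payload >= 0);
    return GRE_HDR_LEN + ETH_HDR_LEN + IPV4_HDR_LEN + ICMP_HDR_LEN +
           ping_payload;
}

/* Before the patch: only the tunnel header is subtracted. */
static bool too_big_old(int ping_payload)
{
    int pkt_size = skb_len_for_ping(ping_payload) - GRE_HDR_LEN;

    return pkt_size > tunnel_mtu();
}

/* After the patch: the inner Ethernet header is subtracted as well. */
static bool too_big_new(int ping_payload)
{
    int pkt_size = skb_len_for_ping(ping_payload) - GRE_HDR_LEN - ETH_HDR_LEN;

    return pkt_size > tunnel_mtu();
}
```

Under these assumptions the old check first rejects a ping with 1421 data
bytes, which is consistent with the reproduction described in the commit
message, while the patched check accepts it.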