
[net-next,v4] ip6_vti: adjust vti mtu according to mtu of lower device

Message ID 1513691961-19692-1-git-send-email-alexey.kodanev@oracle.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Alexey Kodanev Dec. 19, 2017, 1:59 p.m. UTC
LTP/udp6_ipsec_vti tests fail when sending large UDP datagrams over
ip6_vti that require fragmentation and the underlying device has an
MTU smaller than 1500 plus some extra space for headers. This happens
because ip6_vti, by default, sets the MTU to ETH_DATA_LEN and does not
update it based on the destination address or the link parameter.
Further attempts to send UDP packets may succeed because the PMTU gets
updated on ICMPV6_PKT_TOOBIG in vti6_err().

If the lower device has a larger MTU, e.g. 9000, ip6_vti works but does
not use the maximum possible size: output packets are limited to 1500.

Both cases require manual MTU setup after ip6_vti creation. However,
ip_vti already updates the MTU based on the lower device with
ip_tunnel_bind_dev().

Here is an example with the lower device MTU set to 9000:

  # ip a sh ltp_ns_veth2
      ltp_ns_veth2@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 ...
        inet 10.0.0.2/24 scope global ltp_ns_veth2
        inet6 fd00::2/64 scope global

  # ip li add vti6 type vti6 local fd00::2 remote fd00::1
  # ip li show vti6
      vti6@NONE: <POINTOPOINT,NOARP> mtu 1500 ...
        link/tunnel6 fd00::2 peer fd00::1

After the patch:
  # ip li add vti6 type vti6 local fd00::2 remote fd00::1
  # ip li show vti6
      vti6@NONE: <POINTOPOINT,NOARP> mtu 8832 ...
        link/tunnel6 fd00::2 peer fd00::1
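
The 8832 in the output above is consistent with the expression the patch
adds, tdev->mtu - dev->hard_header_len, clamped to IPV6_MIN_MTU. A minimal
userspace sketch of that arithmetic follows. The function name is
hypothetical, and the header length of 168 is only inferred from
9000 - 8832 in this example; it may differ with other kernel
configurations.

```c
/* Smallest MTU an IPv6 link may have (IPV6_MIN_MTU in the kernel). */
#define SKETCH_IPV6_MIN_MTU 1280

/*
 * Hypothetical userspace helper, not kernel code. It mirrors the shape of
 * max_t(int, tdev->mtu - dev->hard_header_len, IPV6_MIN_MTU) from the patch.
 */
int vti6_mtu_sketch(unsigned int lower_mtu, unsigned int hard_header_len)
{
	/* Subtract in signed arithmetic so a small lower MTU goes negative. */
	int mtu = (int)lower_mtu - (int)hard_header_len;

	/* Never report less than the IPv6 minimum. */
	return mtu > SKETCH_IPV6_MIN_MTU ? mtu : SKETCH_IPV6_MIN_MTU;
}
```

Under these assumptions, vti6_mtu_sketch(9000, 168) evaluates to 8832, and
a lower device with MTU 1400 is clamped to 1280.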

Reported-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
---
v4: * remove an empty line between variable declarations
    * update the commit message to reflect unexpected behavior with an MTU
      larger than 1500.

v3: * fix style issue with curly braces around single-statement if block

v2: * clean up commit message issues (thanks to Shannon)
    * handle the case when we don't have route but have device parameter
    * cast new MTU to int and then check the maximum (tdev->mtu can be
      less than dev->hard_header_len)
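
The two functional v2 changes can be illustrated outside the kernel. The
sketch below is an assumption-laden model, not the patch itself: struct
sketch_dev and all sketch_* names are hypothetical stand-ins. It shows the
fallback from the routed device to the configured link device, and why the
MTU comparison is done as int rather than unsigned.

```c
#include <stddef.h>

#define SKETCH_IPV6_MIN_MTU 1280

/* Hypothetical stand-in for the one struct net_device field used here. */
struct sketch_dev {
	unsigned int mtu;
};

/*
 * Prefer the device found by the route lookup; fall back to the device
 * named by the link parameter. Returns NULL when neither is available,
 * in which case the tunnel MTU would be left unchanged.
 */
const struct sketch_dev *sketch_pick_lower(const struct sketch_dev *route_dev,
					   const struct sketch_dev *link_dev)
{
	return route_dev ? route_dev : link_dev;
}

/* Unsigned comparison: a lower MTU below the header length wraps around. */
unsigned int sketch_mtu_unsigned(unsigned int lower_mtu, unsigned int hdr_len)
{
	unsigned int mtu = lower_mtu - hdr_len;

	return mtu > SKETCH_IPV6_MIN_MTU ? mtu : SKETCH_IPV6_MIN_MTU;
}

/*
 * Signed comparison, in the spirit of max_t(int, ...): the wrapped value
 * becomes negative after the cast (as GCC and Clang define the conversion)
 * and is clamped to the minimum.
 */
int sketch_mtu_signed(unsigned int lower_mtu, unsigned int hdr_len)
{
	int mtu = (int)(lower_mtu - hdr_len);

	return mtu > SKETCH_IPV6_MIN_MTU ? mtu : SKETCH_IPV6_MIN_MTU;
}
```

With a lower MTU of 100 and a header length of 168, the unsigned variant
yields a huge bogus MTU, while the signed variant yields 1280.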

 net/ipv6/ip6_vti.c |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

Comments

David Miller Dec. 20, 2017, 4:53 p.m. UTC | #1
From: Alexey Kodanev <alexey.kodanev@oracle.com>
Date: Tue, 19 Dec 2017 16:59:21 +0300

> LTP/udp6_ipsec_vti tests fail when sending large UDP datagrams over
> ip6_vti that require fragmentation and the underlying device has an
> MTU smaller than 1500 plus some extra space for headers. This happens
> because ip6_vti, by default, sets the MTU to ETH_DATA_LEN and does not
> update it based on the destination address or the link parameter.
> Further attempts to send UDP packets may succeed because the PMTU gets
> updated on ICMPV6_PKT_TOOBIG in vti6_err().
> 
> If the lower device has a larger MTU, e.g. 9000, ip6_vti works but does
> not use the maximum possible size: output packets are limited to 1500.
> 
> Both cases require manual MTU setup after ip6_vti creation. However,
> ip_vti already updates the MTU based on the lower device with
> ip_tunnel_bind_dev().
> 
> Here is an example with the lower device MTU set to 9000:
 ...
> Reported-by: Petr Vorel <pvorel@suse.cz>
> Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>

Applied, thanks Alexey.

Patch

diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index dbb74f3..18caa95 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -626,6 +626,7 @@  static void vti6_link_config(struct ip6_tnl *t)
 {
 	struct net_device *dev = t->dev;
 	struct __ip6_tnl_parm *p = &t->parms;
+	struct net_device *tdev = NULL;
 
 	memcpy(dev->dev_addr, &p->laddr, sizeof(struct in6_addr));
 	memcpy(dev->broadcast, &p->raddr, sizeof(struct in6_addr));
@@ -638,6 +639,25 @@  static void vti6_link_config(struct ip6_tnl *t)
 		dev->flags |= IFF_POINTOPOINT;
 	else
 		dev->flags &= ~IFF_POINTOPOINT;
+
+	if (p->flags & IP6_TNL_F_CAP_XMIT) {
+		int strict = (ipv6_addr_type(&p->raddr) &
+			      (IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL));
+		struct rt6_info *rt = rt6_lookup(t->net,
+						 &p->raddr, &p->laddr,
+						 p->link, strict);
+
+		if (rt)
+			tdev = rt->dst.dev;
+		ip6_rt_put(rt);
+	}
+
+	if (!tdev && p->link)
+		tdev = __dev_get_by_index(t->net, p->link);
+
+	if (tdev)
+		dev->mtu = max_t(int, tdev->mtu - dev->hard_header_len,
+				 IPV6_MIN_MTU);
 }
 
 /**