Message ID | 1512578299-7573-1-git-send-email-alexey.kodanev@oracle.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net-next] ip6_vti: adjust vti mtu according to mtu of output device | expand |
On Wed, Dec 06, 2017 at 07:38:19PM +0300, Alexey Kodanev wrote: > LTP/udp6_ipsec_vti tests fail when sending large UDP datagrams > that require fragmentation and underlying device MTU <= 1500. > This happens because ip6_vti sets mtu to ETH_DATA_LEN and not > updating it depending on a destiantion address. > > Futhure attempts to send UDP packets may succeed because pmtu > get updated on ICMPV6_PKT_TOOBIG in vti6_err(). > > Here is the example when output device MTU set to 9000: > > # ip a sh ltp_ns_veth2 > ltp_ns_veth2@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 ... > inet 10.0.0.2/24 scope global ltp_ns_veth2 > inet6 fd00::2/64 scope global > ... > # ip li add vti6 type vti6 local fd00::2 remote fd00::1 > # ip li show vti6 > vti6@NONE: <POINTOPOINT,NOARP> mtu 1500 ... > link/tunnel6 fd00::2 peer fd00::1 > > After the patch: > > # ip li add vti6 type vti6 local fd00::2 remote fd00::1 > # ip li show vti6 > vti6@NONE: <POINTOPOINT,NOARP> mtu 8832 ... > link/tunnel6 fd00::2 peer fd00::1 > > Regarding ip_vti, it already tunes mtu with ip_tunnel_bind_dev(): > > # ip li add vti4 type vti local 10.0.0.2 remote 10.0.0.1 > # ip li sh vti4 > vti4@NONE: <POINTOPOINT,NOARP> mtu 8832 ... > link/ipip 10.0.0.2 peer 10.0.0.1 > > Reported-by: Petr Vorel <pvorel@suse.cz> > Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> > --- > > ip6_vti mtu offset is the same (168) as in ip_vti because ip_vti > offset includes two sizes of struct iphdr: in dev->hard_header_len > and in t_hlen in ip_tunnel_bind_dev(). I'm not sure if it's correct. > > net/ipv6/ip6_vti.c | 18 ++++++++++++++++++ > 1 files changed, 18 insertions(+), 0 deletions(-) > > diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c > index dbb74f3..47e6464 100644 > --- a/net/ipv6/ip6_vti.c > +++ b/net/ipv6/ip6_vti.c > @@ -638,6 +638,24 @@ static void vti6_link_config(struct ip6_tnl *t) > dev->flags |= IFF_POINTOPOINT; > else > dev->flags &= ~IFF_POINTOPOINT; > + > + if (p->flags & IP6_TNL_F_CAP_XMIT) { > + int strict = (ipv6_addr_type(&p->raddr) & > + (IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL)); > + > + struct rt6_info *rt = rt6_lookup(t->net, > + &p->raddr, &p->laddr, > + p->link, strict); > + > + if (!rt) > + return; > + > + if (rt->dst.dev) { > + dev->mtu = max(rt->dst.dev->mtu - dev->hard_header_len, > + IPV6_MIN_MTU); Hm, I'm gettting this when compiling with your patch: In file included from /home/klassert/git/ipsec-next/include/linux/list.h:9:0, from /home/klassert/git/ipsec-next/include/linux/module.h:9, from /home/klassert/git/ipsec-next/net/ipv6/ip6_vti.c:18: /home/klassert/git/ipsec-next/net/ipv6/ip6_vti.c: In function ‘vti6_link_config’: /home/klassert/git/ipsec-next/include/linux/kernel.h:808:16: warning: comparison of distinct pointer types lacks a cast (void) (&max1 == &max2); \ ^ /home/klassert/git/ipsec-next/include/linux/kernel.h:817:2: note: in expansion of macro ‘__max’ __max(typeof(x), typeof(y), \ ^~~~~ /home/klassert/git/ipsec-next/net/ipv6/ip6_vti.c:654:15: note: in expansion of macro ‘max’ dev->mtu = max(rt->dst.dev->mtu - dev->hard_header_len,
On 12/08/2017 10:02 AM, Steffen Klassert wrote: > On Wed, Dec 06, 2017 at 07:38:19PM +0300, Alexey Kodanev wrote: >> LTP/udp6_ipsec_vti tests fail when sending large UDP datagrams >> that require fragmentation and underlying device MTU <= 1500. >> This happens because ip6_vti sets mtu to ETH_DATA_LEN and not >> updating it depending on a destiantion address. >> >> Futhure attempts to send UDP packets may succeed because pmtu >> get updated on ICMPV6_PKT_TOOBIG in vti6_err(). >> >> Here is the example when output device MTU set to 9000: >> >> # ip a sh ltp_ns_veth2 >> ltp_ns_veth2@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 ... >> inet 10.0.0.2/24 scope global ltp_ns_veth2 >> inet6 fd00::2/64 scope global >> ... >> # ip li add vti6 type vti6 local fd00::2 remote fd00::1 >> # ip li show vti6 >> vti6@NONE: <POINTOPOINT,NOARP> mtu 1500 ... >> link/tunnel6 fd00::2 peer fd00::1 >> >> After the patch: >> >> # ip li add vti6 type vti6 local fd00::2 remote fd00::1 >> # ip li show vti6 >> vti6@NONE: <POINTOPOINT,NOARP> mtu 8832 ... >> link/tunnel6 fd00::2 peer fd00::1 >> >> Regarding ip_vti, it already tunes mtu with ip_tunnel_bind_dev(): >> >> # ip li add vti4 type vti local 10.0.0.2 remote 10.0.0.1 >> # ip li sh vti4 >> vti4@NONE: <POINTOPOINT,NOARP> mtu 8832 ... >> link/ipip 10.0.0.2 peer 10.0.0.1 >> >> Reported-by: Petr Vorel <pvorel@suse.cz> >> Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> >> --- >> >> ip6_vti mtu offset is the same (168) as in ip_vti because ip_vti >> offset includes two sizes of struct iphdr: in dev->hard_header_len >> and in t_hlen in ip_tunnel_bind_dev(). I'm not sure if it's correct. >> >> net/ipv6/ip6_vti.c | 18 ++++++++++++++++++ >> 1 files changed, 18 insertions(+), 0 deletions(-) >> >> diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c >> index dbb74f3..47e6464 100644 >> --- a/net/ipv6/ip6_vti.c >> +++ b/net/ipv6/ip6_vti.c >> @@ -638,6 +638,24 @@ static void vti6_link_config(struct ip6_tnl *t) >> dev->flags |= IFF_POINTOPOINT; >> else >> dev->flags &= ~IFF_POINTOPOINT; >> + >> + if (p->flags & IP6_TNL_F_CAP_XMIT) { >> + int strict = (ipv6_addr_type(&p->raddr) & >> + (IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL)); >> + >> + struct rt6_info *rt = rt6_lookup(t->net, >> + &p->raddr, &p->laddr, >> + p->link, strict); >> + >> + if (!rt) >> + return; >> + >> + if (rt->dst.dev) { >> + dev->mtu = max(rt->dst.dev->mtu - dev->hard_header_len, >> + IPV6_MIN_MTU); > > Hm, I'm gettting this when compiling with your patch: > > In file included from /home/klassert/git/ipsec-next/include/linux/list.h:9:0, > from /home/klassert/git/ipsec-next/include/linux/module.h:9, > from /home/klassert/git/ipsec-next/net/ipv6/ip6_vti.c:18: > /home/klassert/git/ipsec-next/net/ipv6/ip6_vti.c: In function ‘vti6_link_config’: > /home/klassert/git/ipsec-next/include/linux/kernel.h:808:16: warning: comparison of distinct pointer types lacks a cast > (void) (&max1 == &max2); \ > ^ > /home/klassert/git/ipsec-next/include/linux/kernel.h:817:2: note: in expansion of macro ‘__max’ > __max(typeof(x), typeof(y), \ > ^~~~~ > /home/klassert/git/ipsec-next/net/ipv6/ip6_vti.c:654:15: note: in expansion of macro ‘max’ > dev->mtu = max(rt->dst.dev->mtu - dev->hard_header_len, > rt->dst.dev->mtu and dev->hard_header_len are both unsigned and IPV6_MIN_MTU considered as int, I guess IPV6_MIN_MTU can be changed to dev->min_mtu as it is set to the same value in setup, but checking in the way it is done in ip6_tnl_link_config() looks better. I'll send 2nd version. Thanks, Alexey
On 12/8/2017 3:54 AM, Alexey Kodanev wrote: > On 12/08/2017 10:02 AM, Steffen Klassert wrote: >> On Wed, Dec 06, 2017 at 07:38:19PM +0300, Alexey Kodanev wrote: Since you're planning to do a 2nd version anyway, can we get a couple of the commit message issues cleaned up? >>> LTP/udp6_ipsec_vti tests fail when sending large UDP datagrams >>> that require fragmentation and underlying device MTU <= 1500. s/underlying device/the underlying device has/ >>> This happens because ip6_vti sets mtu to ETH_DATA_LEN and not >>> updating it depending on a destiantion address. s/destiantion/destination/ >>> >>> Futhure attempts to send UDP packets may succeed because pmtu s/Futhure/Further/ >>> get updated on ICMPV6_PKT_TOOBIG in vti6_err(). s/get/gets/ >>> >>> Here is the example when output device MTU set to 9000: s/output device MTU/the output device MTU is/ Thanks, sln
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index dbb74f3..47e6464 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -638,6 +638,24 @@ static void vti6_link_config(struct ip6_tnl *t) dev->flags |= IFF_POINTOPOINT; else dev->flags &= ~IFF_POINTOPOINT; + + if (p->flags & IP6_TNL_F_CAP_XMIT) { + int strict = (ipv6_addr_type(&p->raddr) & + (IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL)); + + struct rt6_info *rt = rt6_lookup(t->net, + &p->raddr, &p->laddr, + p->link, strict); + + if (!rt) + return; + + if (rt->dst.dev) { + dev->mtu = max(rt->dst.dev->mtu - dev->hard_header_len, + IPV6_MIN_MTU); + } + ip6_rt_put(rt); + } } /**
LTP/udp6_ipsec_vti tests fail when sending large UDP datagrams that require fragmentation and underlying device MTU <= 1500. This happens because ip6_vti sets mtu to ETH_DATA_LEN and not updating it depending on a destiantion address. Futhure attempts to send UDP packets may succeed because pmtu get updated on ICMPV6_PKT_TOOBIG in vti6_err(). Here is the example when output device MTU set to 9000: # ip a sh ltp_ns_veth2 ltp_ns_veth2@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 ... inet 10.0.0.2/24 scope global ltp_ns_veth2 inet6 fd00::2/64 scope global ... # ip li add vti6 type vti6 local fd00::2 remote fd00::1 # ip li show vti6 vti6@NONE: <POINTOPOINT,NOARP> mtu 1500 ... link/tunnel6 fd00::2 peer fd00::1 After the patch: # ip li add vti6 type vti6 local fd00::2 remote fd00::1 # ip li show vti6 vti6@NONE: <POINTOPOINT,NOARP> mtu 8832 ... link/tunnel6 fd00::2 peer fd00::1 Regarding ip_vti, it already tunes mtu with ip_tunnel_bind_dev(): # ip li add vti4 type vti local 10.0.0.2 remote 10.0.0.1 # ip li sh vti4 vti4@NONE: <POINTOPOINT,NOARP> mtu 8832 ... link/ipip 10.0.0.2 peer 10.0.0.1 Reported-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> --- ip6_vti mtu offset is the same (168) as in ip_vti because ip_vti offset includes two sizes of struct iphdr: in dev->hard_header_len and in t_hlen in ip_tunnel_bind_dev(). I'm not sure if it's correct. net/ipv6/ip6_vti.c | 18 ++++++++++++++++++ 1 files changed, 18 insertions(+), 0 deletions(-)