Patchwork net: Fix IPv6 PMTU disc. w/ asymmetric routes

Submitter David Miller
Date Oct. 3, 2010, 9:49 p.m.
Message ID <20101003.144931.71122620.davem@davemloft.net>
Permalink /patch/66606/
State Accepted
Delegated to: David Miller

Comments

David Miller - Oct. 3, 2010, 9:49 p.m.
From: David Miller <davem@davemloft.net>
Date: Thu, 30 Sep 2010 00:41:36 -0700 (PDT)

> Maybe the problem is that the ipv6 side uses the same saddr for both
> the lookup and the entry comparison in these PMTU code paths?  Does it
> not allow specifying them separately as the ipv4 PMTU (and incidentally
> the RT redirect) code paths do?
> 
> Or is this not an issue on the ipv6 side for some reason?

Ok, meanwhile I did the research.

What ipv6 does is that when you look up a route, it clones or copies
the prefix's route into one that is fully specified for a specific
SADDR/DADDR pair, and then inserts that specific route into the FIB6
tree.

Therefore the only cases we should look up for PMTU discovery for ipv6
are:

	{ daddr, saddr, ifindex == 0 }
	{ daddr, saddr, ifindex == dev->ifindex }

This achieves the same effect as what ipv4 is doing.
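
To make the two lookup cases concrete, here is a minimal userspace sketch. It is
not the kernel code: struct toy_route and all toy_* functions are hypothetical
names invented for illustration, and the lookup is deliberately simplified
(ifindex == 0 matches any interface, a non-zero ifindex must match exactly,
which is stricter than what rt6_lookup() does with strict == 0). It only shows
why the ifindex == 0 lookup catches the route actually used for sending when
the Packet Too Big arrives on a different interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cloned, fully specified IPv6 host route. */
struct toy_route {
	unsigned char daddr[16];
	unsigned char saddr[16];
	int ifindex;		/* outgoing interface of this route */
	unsigned int pmtu;
};

/*
 * Simplified lookup (an assumption, not rt6_lookup() semantics):
 * ifindex == 0 accepts the first daddr/saddr match on any interface,
 * a non-zero ifindex additionally requires the interface to match.
 */
static struct toy_route *toy_lookup(struct toy_route *tbl, size_t n,
				    const unsigned char *daddr,
				    const unsigned char *saddr, int ifindex)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (memcmp(tbl[i].daddr, daddr, 16) != 0 ||
		    memcmp(tbl[i].saddr, saddr, 16) != 0)
			continue;
		if (ifindex == 0 || tbl[i].ifindex == ifindex)
			return &tbl[i];
	}
	return NULL;
}

/* Lower the PMTU on the route found for one { daddr, saddr, ifindex } key. */
static void toy_do_pmtu_disc(struct toy_route *tbl, size_t n,
			     const unsigned char *daddr,
			     const unsigned char *saddr,
			     unsigned int pmtu, int ifindex)
{
	struct toy_route *rt = toy_lookup(tbl, n, daddr, saddr, ifindex);

	if (rt == NULL)
		return;
	if (pmtu < rt->pmtu)
		rt->pmtu = pmtu;
}

/* Try both keys: any interface first, then the receiving interface. */
static void toy_pmtu_discovery(struct toy_route *tbl, size_t n,
			       const unsigned char *daddr,
			       const unsigned char *saddr,
			       int recv_ifindex, unsigned int pmtu)
{
	assert(recv_ifindex > 0);
	toy_do_pmtu_disc(tbl, n, daddr, saddr, pmtu, 0);
	toy_do_pmtu_disc(tbl, n, daddr, saddr, pmtu, recv_ifindex);
}

/*
 * Asymmetric case: the only route sends via ifindex 2, but the
 * Packet Too Big is received on ifindex 3. Returns the route's PMTU
 * after processing.
 */
static unsigned int toy_demo_asymmetric(void)
{
	struct toy_route tbl[1];
	unsigned char daddr[16] = { 0x20, 0x01, 0x0d, 0xb8, [15] = 1 };
	unsigned char saddr[16] = { 0x20, 0x01, 0x0d, 0xb8, [15] = 2 };

	memset(tbl, 0, sizeof(tbl));
	memcpy(tbl[0].daddr, daddr, 16);
	memcpy(tbl[0].saddr, saddr, 16);
	tbl[0].ifindex = 2;
	tbl[0].pmtu = 1500;

	toy_pmtu_discovery(tbl, 1, daddr, saddr, 3, 1400);
	return tbl[0].pmtu;
}
```

In this toy model, a lookup keyed only on the receiving interface (ifindex 3)
finds nothing and the route keeps its old PMTU. The extra ifindex == 0 lookup
is what lets toy_demo_asymmetric() see the reduced value.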

So Maciej, your original attempt was correct all along, and as a result
I'll commit the following.

Thanks!

--------------------
net: Fix IPv6 PMTU disc. w/ asymmetric routes

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/ipv6/route.c |   28 ++++++++++++++++++++++++----
 1 files changed, 24 insertions(+), 4 deletions(-)
Maciej Żenczykowski - Oct. 4, 2010, 12:21 a.m.
> Ok, meanwhile I did the research.
>
> What ipv6 does is that when you look up a route, it clones or copies
> the prefix's route into one that is fully specified for a specific
> SADDR/DADDR pair, and then inserts that specific route into the FIB6
> tree.
>
> Therefore the only cases we should look up for PMTU discovery for ipv6
> are:
>
>        { daddr, saddr, ifindex == 0 }
>        { daddr, saddr, ifindex == dev->ifindex }
>
> This achieves the same effect as what ipv4 is doing.
>
> So Maciej, your original attempt was correct all along, and as a result
> I'll commit the following.
>
> Thanks!

Awesome.  I've been trying to convince myself of this and come up with
a concise explanation - but it sounds like you got there first.

[I ran into SUBTREES and got stuck in trying to understand exactly how
they work and whether they affect this at all.]

Patch

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 8323136..a275c6e 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1556,14 +1556,13 @@  out:
  *	i.e. Path MTU discovery
  */
 
-void rt6_pmtu_discovery(struct in6_addr *daddr, struct in6_addr *saddr,
-			struct net_device *dev, u32 pmtu)
+static void rt6_do_pmtu_disc(struct in6_addr *daddr, struct in6_addr *saddr,
+			     struct net *net, u32 pmtu, int ifindex)
 {
 	struct rt6_info *rt, *nrt;
-	struct net *net = dev_net(dev);
 	int allfrag = 0;
 
-	rt = rt6_lookup(net, daddr, saddr, dev->ifindex, 0);
+	rt = rt6_lookup(net, daddr, saddr, ifindex, 0);
 	if (rt == NULL)
 		return;
 
@@ -1631,6 +1630,27 @@  out:
 	dst_release(&rt->dst);
 }
 
+void rt6_pmtu_discovery(struct in6_addr *daddr, struct in6_addr *saddr,
+			struct net_device *dev, u32 pmtu)
+{
+	struct net *net = dev_net(dev);
+
+	/*
+	 * RFC 1981 states that a node "MUST reduce the size of the packets it
+	 * is sending along the path" that caused the Packet Too Big message.
+	 * Since it's not possible in the general case to determine which
+	 * interface was used to send the original packet, we update the MTU
+	 * on the interface that will be used to send future packets. We also
+	 * update the MTU on the interface that received the Packet Too Big in
+	 * case the original packet was forced out that interface with
+	 * SO_BINDTODEVICE or similar. This is the next best thing to the
+	 * correct behaviour, which would be to update the MTU on all
+	 * interfaces.
+	 */
+	rt6_do_pmtu_disc(daddr, saddr, net, pmtu, 0);
+	rt6_do_pmtu_disc(daddr, saddr, net, pmtu, dev->ifindex);
+}
+
 /*
  *	Misc support functions
  */