Message ID | 20150203120140.GU13046@secunet.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
在 2015/2/3 20:01, Steffen Klassert 写道: > On Tue, Feb 03, 2015 at 06:54:19PM +0800, shengyong wrote: >> >> >> 在 2015/2/3 17:28, Steffen Klassert 写道: >>> On Mon, Feb 02, 2015 at 04:20:24PM +0800, shengyong wrote: >>> >>> We first need to find out why you receive this Packet Too Big message, >> The packet is sent by a commercial-off-the-shelf testcase, and I can reproduce the >> situation by using scapy and creating a packet as the following: >> >> $ cat packet-too-big.py >> #!/usr/bin/python >> >> from scapy.all import * >> >> # fe80::800:27ff:fe00:0 is linklocal addr of PC >> # fe80::a00:27ff:fe1a:e2a0 is linklocal addr of VM >> base=IPv6(src='fe80::800:27ff:fe00:0',dst='fe80::a00:27ff:fe1a:e2a0') >> pkt_too_big=ICMPv6PacketTooBig(mtu=1024) >> ext_base=IPv6(src='fe80::a00:27ff:fe1a:e2a0',dst='fe80::a00:27ff:fe1a:e2a0',plen=24) >> ext_nd_na=ICMPv6ND_NA() >> >> packet=base/pkt_too_big/ext_base/ext_nd_na >> send(packet) > > So it is not a valid pmtu update, this make life easier. > > Can you please test the patch below (compile tested only)? Sorry, the later. I test it on 3.10-stable. It can fix this problem. So maybe this is a bug? And the 3 approaches (different flags are used: RTF_LOCAL, IFF_LOOPBACK and RTF_CACHE) in these mails can fix the expire of local address. I'm confused about these flags: * RTF_LOCAL: the entries of local address, like address binded to the native NIC * RTF_CACHE: all cached entries * IFF_LOOPBACK: this is a device-related flag, which has the same meaning as RTF_LOCAL Am I right? If so, I think RTF_LOCAL is appropriate, because we just want entries of local addresses to be not expired and we don't care other entries (I think if they get expired, a neigh discovery could find them back). thx, Sheng > > This should fix your problem, and in combination with the two patches I sent > out last week, it should cure the whole 'expiring of uncached routes' problem. > > diff --git a/net/ipv6/route.c b/net/ipv6/route.c > index 49596535..4ccfb9c 100644 > --- a/net/ipv6/route.c > +++ b/net/ipv6/route.c > @@ -1156,7 +1156,8 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk, > struct rt6_info *rt6 = (struct rt6_info *)dst; > > dst_confirm(dst); > - if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128) { > + if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128 && > + (rt6->rt6i_flags & RTF_CACHE)) { > struct net *net = dev_net(dst->dev); > > rt6->rt6i_flags |= RTF_MODIFIED; > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > . > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Feb 04, 2015 at 09:59:54AM +0800, shengyong wrote: > > Sorry, the later. I test it on 3.10-stable. It can fix this problem. So maybe this is a bug? Yes, it's a bug. > And the 3 approaches (different flags are used: RTF_LOCAL, IFF_LOOPBACK and RTF_CACHE) in > these mails can fix the expire of local address. I'm confused about these flags: > * RTF_LOCAL: the entries of local address, like address binded to the native NIC > * RTF_CACHE: all cached entries > * IFF_LOOPBACK: this is a device-related flag, which has the same meaning as RTF_LOCAL > > Am I right? If so, I think RTF_LOCAL is appropriate, because we just want entries of local > addresses to be not expired and we don't care other entries (I think if they get expired, > a neigh discovery could find them back). It is not the address that expires, it is the learned PMTU value that expires. If we delete an uncached route, nothing will bring it back unless you readd it manually. We need to ensure that all routes that can learn something what expires are cached. This means that we clone the inserted route when it is used for the first time. The learned values are stored at this cloned route. If the learned value expires, the clone is deleted. The original route remains in the fib tree and can be still looked up. The problem is, that we currently don't cache/clone host routes. So if a host route learns something that expires, the original route is removed from the fib tree and we loose connectivity to that host. We don't cache host routes because some of them (the local ones) are automatically added with metric 0. If we try to cache such a route, the clone will be identical to the original route and we fail to insert it to the fib tree. So we need to adjust the caching to all routes that actually can learn something and leave out only those that can not. I'll send a patchset that should fix this at the beginning of the next week. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
在 2015/2/5 15:21, Steffen Klassert 写道: > On Wed, Feb 04, 2015 at 09:59:54AM +0800, shengyong wrote: >> >> Sorry, the later. I test it on 3.10-stable. It can fix this problem. So maybe this is a bug? > > Yes, it's a bug. > >> And the 3 approaches (different flags are used: RTF_LOCAL, IFF_LOOPBACK and RTF_CACHE) in >> these mails can fix the expire of local address. I'm confused about these flags: >> * RTF_LOCAL: the entries of local address, like address binded to the native NIC >> * RTF_CACHE: all cached entries >> * IFF_LOOPBACK: this is a device-related flag, which has the same meaning as RTF_LOCAL >> >> Am I right? If so, I think RTF_LOCAL is appropriate, because we just want entries of local >> addresses to be not expired and we don't care other entries (I think if they get expired, >> a neigh discovery could find them back). > > It is not the address that expires, it is the learned PMTU value that > expires. If we delete an uncached route, nothing will bring it back > unless you readd it manually. > > We need to ensure that all routes that can learn something what > expires are cached. This means that we clone the inserted route > when it is used for the first time. The learned values are stored > at this cloned route. If the learned value expires, the clone is > deleted. The original route remains in the fib tree and can be > still looked up. > > The problem is, that we currently don't cache/clone host routes. > So if a host route learns something that expires, the original > route is removed from the fib tree and we loose connectivity > to that host. We don't cache host routes because some of them > (the local ones) are automatically added with metric 0. > If we try to cache such a route, the clone will be identical > to the original route and we fail to insert it to the fib tree. > > So we need to adjust the caching to all routes that actually can > learn something and leave out only those that can not. > > I'll send a patchset that should fix this at the beginning of the > next week. Hi, Steffen Is this patchset ready? It seems that I didn't find it in the mainling list. If it is ready, I can test it to see if it solves the problem I met :) many thanks, Sheng > > . > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Feb 27, 2015 at 10:37:46AM +0800, shengyong wrote: > Hi, Steffen > Is this patchset ready? It seems that I didn't find it in the mainling > list. If it is ready, I can test it to see if it solves the problem I > met :) Martin Lau pointed me to a problem when cached host routes are deleted. We may delete the clone but not the original route. This has to be resolved first. Unfortunately I had no time to look into this so far. I hope to get to it next week. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Feb 27, 2015 at 10:37:46AM +0800, shengyong wrote: > > > 在 2015/2/5 15:21, Steffen Klassert 写道: > > On Wed, Feb 04, 2015 at 09:59:54AM +0800, shengyong wrote: > >> > >> Sorry, the later. I test it on 3.10-stable. It can fix this problem. So maybe this is a bug? > > > > Yes, it's a bug. > > > >> And the 3 approaches (different flags are used: RTF_LOCAL, IFF_LOOPBACK and RTF_CACHE) in > >> these mails can fix the expire of local address. I'm confused about these flags: > >> * RTF_LOCAL: the entries of local address, like address binded to the native NIC > >> * RTF_CACHE: all cached entries > >> * IFF_LOOPBACK: this is a device-related flag, which has the same meaning as RTF_LOCAL > >> > >> Am I right? If so, I think RTF_LOCAL is appropriate, because we just want entries of local > >> addresses to be not expired and we don't care other entries (I think if they get expired, > >> a neigh discovery could find them back). > > > > It is not the address that expires, it is the learned PMTU value that > > expires. If we delete an uncached route, nothing will bring it back > > unless you readd it manually. > > > > We need to ensure that all routes that can learn something what > > expires are cached. This means that we clone the inserted route > > when it is used for the first time. The learned values are stored > > at this cloned route. If the learned value expires, the clone is > > deleted. The original route remains in the fib tree and can be > > still looked up. > > > > The problem is, that we currently don't cache/clone host routes. > > So if a host route learns something that expires, the original > > route is removed from the fib tree and we loose connectivity > > to that host. We don't cache host routes because some of them > > (the local ones) are automatically added with metric 0. > > If we try to cache such a route, the clone will be identical > > to the original route and we fail to insert it to the fib tree. > > > > So we need to adjust the caching to all routes that actually can > > learn something and leave out only those that can not. > > > > I'll send a patchset that should fix this at the beginning of the > > next week. > Hi, Steffen > Is this patchset ready? It seems that I didn't find it in the mainling > list. If it is ready, I can test it to see if it solves the problem I > met :) I finally end up with a patchset that passed all your testcases. I'll send it in reply to this mail, please test it. Thanks! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/30/2015 6:32 PM, Steffen Klassert wrote: > On Fri, Feb 27, 2015 at 10:37:46AM +0800, shengyong wrote: >> >> >> 在 2015/2/5 15:21, Steffen Klassert 写道: >>> On Wed, Feb 04, 2015 at 09:59:54AM +0800, shengyong wrote: >>>> >>>> Sorry, the later. I test it on 3.10-stable. It can fix this problem. So maybe this is a bug? >>> >>> Yes, it's a bug. >>> >>>> And the 3 approaches (different flags are used: RTF_LOCAL, IFF_LOOPBACK and RTF_CACHE) in >>>> these mails can fix the expire of local address. I'm confused about these flags: >>>> * RTF_LOCAL: the entries of local address, like address binded to the native NIC >>>> * RTF_CACHE: all cached entries >>>> * IFF_LOOPBACK: this is a device-related flag, which has the same meaning as RTF_LOCAL >>>> >>>> Am I right? If so, I think RTF_LOCAL is appropriate, because we just want entries of local >>>> addresses to be not expired and we don't care other entries (I think if they get expired, >>>> a neigh discovery could find them back). >>> >>> It is not the address that expires, it is the learned PMTU value that >>> expires. If we delete an uncached route, nothing will bring it back >>> unless you readd it manually. >>> >>> We need to ensure that all routes that can learn something what >>> expires are cached. This means that we clone the inserted route >>> when it is used for the first time. The learned values are stored >>> at this cloned route. If the learned value expires, the clone is >>> deleted. The original route remains in the fib tree and can be >>> still looked up. >>> >>> The problem is, that we currently don't cache/clone host routes. >>> So if a host route learns something that expires, the original >>> route is removed from the fib tree and we loose connectivity >>> to that host. We don't cache host routes because some of them >>> (the local ones) are automatically added with metric 0. >>> If we try to cache such a route, the clone will be identical >>> to the original route and we fail to insert it to the fib tree. >>> >>> So we need to adjust the caching to all routes that actually can >>> learn something and leave out only those that can not. >>> >>> I'll send a patchset that should fix this at the beginning of the >>> next week. >> Hi, Steffen >> Is this patchset ready? It seems that I didn't find it in the mainling >> list. If it is ready, I can test it to see if it solves the problem I >> met :) > > I finally end up with a patchset that passed all your testcases. > I'll send it in reply to this mail, please test it. > I just test the patchset, and it can solve the problem I met. Thanks very much, Sheng > Thanks! > > . > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 49596535..4ccfb9c 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1156,7 +1156,8 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk, struct rt6_info *rt6 = (struct rt6_info *)dst; dst_confirm(dst); - if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128) { + if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128 && + (rt6->rt6i_flags & RTF_CACHE)) { struct net *net = dev_net(dst->dev); rt6->rt6i_flags |= RTF_MODIFIED;