Message ID | 20130130082608.GA1604@minipsycho.orion |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
From: Jiri Pirko <jiri@resnulli.us>
Date: Wed, 30 Jan 2013 09:26:08 +0100

> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
>
> They will be created at output, if ever needed. This avoids creating
> empty neighbor entries when TPROXYing/Forwarding packets for addresses
> that are not even directly reachable.
>
> Note that IPv4 already handles it this way. No neighbor entries are
> created for local input.
>
> Tested by myself and customer.
>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>

Applied and queued up for -stable, thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
>
> They will be created at output, if ever needed. This avoids creating
> empty neighbor entries when TPROXYing/Forwarding packets for addresses
> that are not even directly reachable.
>
> Note that IPv4 already handles it this way. No neighbor entries are
> created for local input.
>
> Tested by myself and customer.
>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
> ---
>  net/ipv6/route.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index e229a3b..363d8b7 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -928,7 +928,7 @@ restart:
>          dst_hold(&rt->dst);
>          read_unlock_bh(&table->tb6_lock);
>
> -        if (!rt->n && !(rt->rt6i_flags & RTF_NONEXTHOP))
> +        if (!rt->n && !(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_LOCAL)))
>                  nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
>          else if (!(rt->dst.flags & DST_HOST))
>                  nrt = rt6_alloc_clone(rt, &fl6->daddr);

I'm not sure this patch is doing the right thing. It seems to break
IPv6 loopback functionality, it is no longer equivalent to IPv4, as
stated above. It doesn't just stop neighbor creation but it stops
cached route creation. Seems like a scary change for a stable tree.
See below:

$ ip -4 route show local
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1

This local route enables us to use the whole loopback network, any
address inside 127.0.0.0/8 will work.

$ ping -c1 127.0.0.9
PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms

--- 127.0.0.9 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms

This also used to work equivalently for IPv6 local loopback routes:

$ ip -6 route add local 2001::/64 dev lo
$ ping6 -c1 2001::9
PING 2001::9(2001::9) 56 data bytes
64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms

--- 2001::9 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms

However with this patch, this is very broken:

$ ip -6 route add local 2001::/64 dev lo
$ ping6 -c1 2001::9
PING 2001::9(2001::9) 56 data bytes
ping: sendmsg: Invalid argument

--- 2001::9 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

Thanks,
Debabrata
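A minimal user-space sketch of the condition touched by the quoted hunk, for readers who want to see which branch a given route takes before and after the change. This is not kernel code: struct route_model, pick_branch(), the MASK_* macros and the RT_* enum are hypothetical names invented for illustration, the flag values are written from memory of include/uapi/linux/ipv6_route.h and should be checked against the tree in use, and the final "use as is" branch is an assumption because the quoted hunk does not show what follows the else-if. The sketch only shows branch selection; it says nothing about what happens later on the output path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values assumed from include/uapi/linux/ipv6_route.h; illustrative only. */
#define RTF_NONEXTHOP 0x00200000u
#define RTF_LOCAL     0x80000000u

/* Masks tested by the condition before and after the patch. */
#define MASK_BEFORE (RTF_NONEXTHOP)
#define MASK_AFTER  (RTF_NONEXTHOP | RTF_LOCAL)

/* Hypothetical labels for the three possible outcomes. */
enum rt_action { RT_ALLOC_COW, RT_ALLOC_CLONE, RT_USE_AS_IS };

/* Hypothetical stand-in for the fields of rt6_info used by the condition. */
struct route_model {
    bool     has_neigh; /* models rt->n != NULL */
    uint32_t flags;     /* models rt->rt6i_flags */
    bool     dst_host;  /* models rt->dst.flags & DST_HOST */
};

/* Mirrors the if / else-if chain in the hunk for a given flag mask. */
static enum rt_action pick_branch(const struct route_model *rt, uint32_t mask)
{
    if (!rt->has_neigh && !(rt->flags & mask))
        return RT_ALLOC_COW;    /* rt6_alloc_cow() in the hunk */
    else if (!rt->dst_host)
        return RT_ALLOC_CLONE;  /* rt6_alloc_clone() in the hunk */
    return RT_USE_AS_IS;        /* assumed: not shown in the quoted hunk */
}
```

Under these assumptions, a local /64 route on lo with no cached neighbour (has_neigh false, flags RTF_LOCAL, dst_host false) selects the COW branch with MASK_BEFORE and the clone branch with MASK_AFTER, which is the case exercised by the `ip -6 route add local 2001::/64 dev lo` example.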
On Thu, Aug 08, 2013 at 02:45:40PM -0400, Debabrata Banerjee wrote:
> On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> > From: Marcelo Ricardo Leitner <mleitner@redhat.com>
> >
> > They will be created at output, if ever needed. This avoids creating
> > empty neighbor entries when TPROXYing/Forwarding packets for addresses
> > that are not even directly reachable.
> >
> > Note that IPv4 already handles it this way. No neighbor entries are
> > created for local input.
> >
> > Tested by myself and customer.
> >
> > Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> > Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
> >
> > [...]
>
> I'm not sure this patch is doing the right thing. It seems to break
> IPv6 loopback functionality, it is no longer equivalent to IPv4, as
> stated above. It doesn't just stop neighbor creation but it stops
> cached route creation. Seems like a scary change for a stable tree.
> See below:
>
> $ ip -4 route show local
> local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
>
> This local route enables us to use the whole loopback network, any
> address inside 127.0.0.0/8 will work.
>
> $ ping -c1 127.0.0.9
> PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
> 64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms
>
> --- 127.0.0.9 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms
>
> This also used to work equivalently for IPv6 local loopback routes:
>
> $ ip -6 route add local 2001::/64 dev lo
> $ ping6 -c1 2001::9
> PING 2001::9(2001::9) 56 data bytes
> 64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms
>
> --- 2001::9 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
>
> However with this patch, this is very broken:
>
> $ ip -6 route add local 2001::/64 dev lo
> $ ping6 -c1 2001::9
> PING 2001::9(2001::9) 56 data bytes
> ping: sendmsg: Invalid argument
>
> --- 2001::9 ping statistics ---
> 1 packets transmitted, 0 received, 100% packet loss, time 0ms

Which kernel version are you using? Perhaps you are missing another
fix? It works for me. Also I cannot find this patch in net-next?

Greetings,

  Hannes
On 08-08-2013 16:01, Hannes Frederic Sowa wrote:
> On Thu, Aug 08, 2013 at 02:45:40PM -0400, Debabrata Banerjee wrote:
>> On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>>
>>> They will be created at output, if ever needed. This avoids creating
>>> empty neighbor entries when TPROXYing/Forwarding packets for addresses
>>> that are not even directly reachable.
>>>
>>> Note that IPv4 already handles it this way. No neighbor entries are
>>> created for local input.
>>>
>>> Tested by myself and customer.
>>>
>>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>>> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>>
>>> [...]
>>
>> I'm not sure this patch is doing the right thing. It seems to break
>> IPv6 loopback functionality, it is no longer equivalent to IPv4, as
>> stated above. It doesn't just stop neighbor creation but it stops
>> cached route creation. Seems like a scary change for a stable tree.
>> See below:
>>
>> $ ip -4 route show local
>> local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
>>
>> This local route enables us to use the whole loopback network, any
>> address inside 127.0.0.0/8 will work.
>>
>> $ ping -c1 127.0.0.9
>> PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
>> 64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms
>>
>> --- 127.0.0.9 ping statistics ---
>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>> rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms
>>
>> This also used to work equivalently for IPv6 local loopback routes:
>>
>> $ ip -6 route add local 2001::/64 dev lo
>> $ ping6 -c1 2001::9
>> PING 2001::9(2001::9) 56 data bytes
>> 64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms
>>
>> --- 2001::9 ping statistics ---
>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>> rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
>>
>> However with this patch, this is very broken:
>>
>> $ ip -6 route add local 2001::/64 dev lo
>> $ ping6 -c1 2001::9
>> PING 2001::9(2001::9) 56 data bytes
>> ping: sendmsg: Invalid argument
>>
>> --- 2001::9 ping statistics ---
>> 1 packets transmitted, 0 received, 100% packet loss, time 0ms
>
> Which kernel version are you using? Perhaps you are missing another
> fix? It works for me. Also I cannot find this patch in net-next?

It wasn't needed/applied as the route cache was removed.

Regards,
Marcelo
On Thu, Aug 08, 2013 at 04:02:36PM -0300, Marcelo Ricardo Leitner wrote:
> On 08-08-2013 16:01, Hannes Frederic Sowa wrote:
> >On Thu, Aug 08, 2013 at 02:45:40PM -0400, Debabrata Banerjee wrote:
> >>On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> >>>From: Marcelo Ricardo Leitner <mleitner@redhat.com>
> >>I'm not sure this patch is doing the right thing. It seems to break
> >>IPv6 loopback functionality, it is no longer equivalent to IPv4, as
> >>stated above. It doesn't just stop neighbor creation but it stops
> >>cached route creation. Seems like a scary change for a stable tree.
> >>See below:
> >>
> >>$ ip -4 route show local
> >>local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
> >>
> >>This local route enables us to use the whole loopback network, any
> >>address inside 127.0.0.0/8 will work.
> >>
> >>$ ping -c1 127.0.0.9
> >>PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
> >>64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms
> >>
> >>--- 127.0.0.9 ping statistics ---
> >>1 packets transmitted, 1 received, 0% packet loss, time 0ms
> >>rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms
> >>
> >>This also used to work equivalently for IPv6 local loopback routes:
> >>
> >>$ ip -6 route add local 2001::/64 dev lo
> >>$ ping6 -c1 2001::9
> >>PING 2001::9(2001::9) 56 data bytes
> >>64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms
> >>
> >>--- 2001::9 ping statistics ---
> >>1 packets transmitted, 1 received, 0% packet loss, time 0ms
> >>rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
> >>
> >>However with this patch, this is very broken:
> >>
> >>$ ip -6 route add local 2001::/64 dev lo
> >>$ ping6 -c1 2001::9
> >>PING 2001::9(2001::9) 56 data bytes
> >>ping: sendmsg: Invalid argument
> >>
> >>--- 2001::9 ping statistics ---
> >>1 packets transmitted, 0 received, 100% packet loss, time 0ms
> >
> >Which kernel version are you using? Perhaps you are missing another
> >fix? It works for me. Also I cannot find this patch in net-next?
>
> It wasn't needed/applied as the route cache was removed.

Do you mean the rt->n(eighbour) removal? There was no removal of a
route cache in IPv6 land. The cache is merely in the routing table
itself.

Greetings,

  Hannes
On 08-08-2013 16:06, Hannes Frederic Sowa wrote:
> On Thu, Aug 08, 2013 at 04:02:36PM -0300, Marcelo Ricardo Leitner wrote:
>> On 08-08-2013 16:01, Hannes Frederic Sowa wrote:
>>> On Thu, Aug 08, 2013 at 02:45:40PM -0400, Debabrata Banerjee wrote:
>>>> On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>>> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>>> I'm not sure this patch is doing the right thing. It seems to break
>>>> IPv6 loopback functionality, it is no longer equivalent to IPv4, as
>>>> stated above. It doesn't just stop neighbor creation but it stops
>>>> cached route creation. Seems like a scary change for a stable tree.
>>>> See below:
>>>>
>>>> $ ip -4 route show local
>>>> local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
>>>>
>>>> This local route enables us to use the whole loopback network, any
>>>> address inside 127.0.0.0/8 will work.
>>>>
>>>> $ ping -c1 127.0.0.9
>>>> PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
>>>> 64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms
>>>>
>>>> --- 127.0.0.9 ping statistics ---
>>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>>> rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms
>>>>
>>>> This also used to work equivalently for IPv6 local loopback routes:
>>>>
>>>> $ ip -6 route add local 2001::/64 dev lo
>>>> $ ping6 -c1 2001::9
>>>> PING 2001::9(2001::9) 56 data bytes
>>>> 64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms
>>>>
>>>> --- 2001::9 ping statistics ---
>>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>>> rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
>>>>
>>>> However with this patch, this is very broken:
>>>>
>>>> $ ip -6 route add local 2001::/64 dev lo
>>>> $ ping6 -c1 2001::9
>>>> PING 2001::9(2001::9) 56 data bytes
>>>> ping: sendmsg: Invalid argument
>>>>
>>>> --- 2001::9 ping statistics ---
>>>> 1 packets transmitted, 0 received, 100% packet loss, time 0ms
>>>
>>> Which kernel version are you using? Perhaps you are missing another
>>> fix? It works for me. Also I cannot find this patch in net-next?
>>
>> It wasn't needed/applied as the route cache was removed.
>
> Do you mean the rt->n(eighbour) removal? There was no removal of a
> route cache in IPv6 land. The cache is merely in the routing table
> itself.

Yes, my bad, sorry. s/route/neighbour/. It was discussed on this thread:
http://article.gmane.org/gmane.linux.network/255318

"Note also that YOSHIFUJI Hideaki's patches to remove the cached neighbour
entirely from ipv6 routes will have the same effect, so your patch won't
be needed."

Thanks,
Marcelo
On Thu, Aug 08, 2013 at 04:11:28PM -0300, Marcelo Ricardo Leitner wrote:
> On 08-08-2013 16:06, Hannes Frederic Sowa wrote:
> >On Thu, Aug 08, 2013 at 04:02:36PM -0300, Marcelo Ricardo Leitner wrote:
> >>On 08-08-2013 16:01, Hannes Frederic Sowa wrote:
> >>>On Thu, Aug 08, 2013 at 02:45:40PM -0400, Debabrata Banerjee wrote:
> >>>>On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> >>>>>From: Marcelo Ricardo Leitner <mleitner@redhat.com>
> >>>>I'm not sure this patch is doing the right thing. It seems to break
> >>>>IPv6 loopback functionality, it is no longer equivalent to IPv4, as
> >>>>stated above. It doesn't just stop neighbor creation but it stops
> >>>>cached route creation. Seems like a scary change for a stable tree.
> >>>>See below:
> >>>>
> >>>>$ ip -4 route show local
> >>>>local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
> >>>>
> >>>>This local route enables us to use the whole loopback network, any
> >>>>address inside 127.0.0.0/8 will work.
> >>>>
> >>>>$ ping -c1 127.0.0.9
> >>>>PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
> >>>>64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms
> >>>>
> >>>>--- 127.0.0.9 ping statistics ---
> >>>>1 packets transmitted, 1 received, 0% packet loss, time 0ms
> >>>>rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms
> >>>>
> >>>>This also used to work equivalently for IPv6 local loopback routes:
> >>>>
> >>>>$ ip -6 route add local 2001::/64 dev lo
> >>>>$ ping6 -c1 2001::9
> >>>>PING 2001::9(2001::9) 56 data bytes
> >>>>64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms
> >>>>
> >>>>--- 2001::9 ping statistics ---
> >>>>1 packets transmitted, 1 received, 0% packet loss, time 0ms
> >>>>rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
> >>>>
> >>>>However with this patch, this is very broken:
> >>>>
> >>>>$ ip -6 route add local 2001::/64 dev lo
> >>>>$ ping6 -c1 2001::9
> >>>>PING 2001::9(2001::9) 56 data bytes
> >>>>ping: sendmsg: Invalid argument
> >>>>
> >>>>--- 2001::9 ping statistics ---
> >>>>1 packets transmitted, 0 received, 100% packet loss, time 0ms
> >>>
> >>>Which kernel version are you using? Perhaps you are missing another
> >>>fix? It works for me. Also I cannot find this patch in net-next?
> >>
> >>It wasn't needed/applied as the route cache was removed.
> >
> >Do you mean the rt->n(eighbour) removal? There was no removal of a
> >route cache in IPv6 land. The cache is merely in the routing table
> >itself.
>
> Yes, my bad, sorry. s/route/neighbour/. It was discussed on this thread:
> http://article.gmane.org/gmane.linux.network/255318
>
> "Note also that YOSHIFUJI Hideaki's patches to remove the cached neighbour
> entirely from ipv6 routes will have the same effect, so your patch won't
> be needed."

Ok, thanks!

But it somehow managed to get into stable kernels, no? Kernels after
the rt->n removal should not be affected. At least the example above
works correctly on my net-next kernel.

Greetings,

  Hannes
On Thu, Aug 8, 2013 at 3:01 PM, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
>
> Which kernel version are you using? Perhaps you are missing another
> fix? It works for me. Also I cannot find this patch in net-next?
>

I just pulled and tried longterm 3.2.50; the behavior is the same, still
broken.

-Debabrata
On 08-08-2013 16:16, Hannes Frederic Sowa wrote:
> On Thu, Aug 08, 2013 at 04:11:28PM -0300, Marcelo Ricardo Leitner wrote:
>> On 08-08-2013 16:06, Hannes Frederic Sowa wrote:
>>> On Thu, Aug 08, 2013 at 04:02:36PM -0300, Marcelo Ricardo Leitner wrote:
>>>> On 08-08-2013 16:01, Hannes Frederic Sowa wrote:
>>>>> On Thu, Aug 08, 2013 at 02:45:40PM -0400, Debabrata Banerjee wrote:
>>>>>> On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>>>>> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>>>>> I'm not sure this patch is doing the right thing. It seems to break
>>>>>> IPv6 loopback functionality, it is no longer equivalent to IPv4, as
>>>>>> stated above. It doesn't just stop neighbor creation but it stops
>>>>>> cached route creation. Seems like a scary change for a stable tree.
>>>>>> See below:
>>>>>>
>>>>>> $ ip -4 route show local
>>>>>> local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
>>>>>>
>>>>>> This local route enables us to use the whole loopback network, any
>>>>>> address inside 127.0.0.0/8 will work.
>>>>>>
>>>>>> $ ping -c1 127.0.0.9
>>>>>> PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
>>>>>> 64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms
>>>>>>
>>>>>> --- 127.0.0.9 ping statistics ---
>>>>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>>>>> rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms
>>>>>>
>>>>>> This also used to work equivalently for IPv6 local loopback routes:
>>>>>>
>>>>>> $ ip -6 route add local 2001::/64 dev lo
>>>>>> $ ping6 -c1 2001::9
>>>>>> PING 2001::9(2001::9) 56 data bytes
>>>>>> 64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms
>>>>>>
>>>>>> --- 2001::9 ping statistics ---
>>>>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>>>>> rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
>>>>>>
>>>>>> However with this patch, this is very broken:
>>>>>>
>>>>>> $ ip -6 route add local 2001::/64 dev lo
>>>>>> $ ping6 -c1 2001::9
>>>>>> PING 2001::9(2001::9) 56 data bytes
>>>>>> ping: sendmsg: Invalid argument
>>>>>>
>>>>>> --- 2001::9 ping statistics ---
>>>>>> 1 packets transmitted, 0 received, 100% packet loss, time 0ms
>>>>>
>>>>> Which kernel version are you using? Perhaps you are missing another
>>>>> fix? It works for me. Also I cannot find this patch in net-next?
>>>>
>>>> It wasn't needed/applied as the route cache was removed.
>>>
>>> Do you mean the rt->n(eighbour) removal? There was no removal of a
>>> route cache in IPv6 land. The cache is merely in the routing table
>>> itself.
>>
>> Yes, my bad, sorry. s/route/neighbour/. It was discussed on this thread:
>> http://article.gmane.org/gmane.linux.network/255318
>>
>> "Note also that YOSHIFUJI Hideaki's patches to remove the cached neighbour
>> entirely from ipv6 routes will have the same effect, so your patch won't
>> be needed."
>
> Ok, thanks!
>
> But it somehow managed to get into stable kernels, no? Kernels after
> the rt->n removal should not be affected. At least the example above
> works correctly on my net-next kernel.

Yes, it did, as an intermediate fix, let's say. As we wouldn't remove the
cache for the -stable tree, this patch seemed reasonable to avoid creating
a flood of unwanted entries. Without it, when using TPROXY, the kernel was
creating neighbor entries for IP addresses that were behind a gateway.

In case it helps:
http://thread.gmane.org/gmane.linux.network/255234/focus=257293
http://article.gmane.org/gmane.linux.network/257433 (this thread, actually)

Thanks,
Marcelo
On Thu, Aug 08, 2013 at 02:45:40PM -0400, Debabrata Banerjee wrote:
> On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> > From: Marcelo Ricardo Leitner <mleitner@redhat.com>
> >
> > They will be created at output, if ever needed. This avoids creating
> > empty neighbor entries when TPROXYing/Forwarding packets for addresses
> > that are not even directly reachable.
> >
> > Note that IPv4 already handles it this way. No neighbor entries are
> > created for local input.
> >
> > Tested by myself and customer.
> >
> > Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> > Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
> > ---
> >  net/ipv6/route.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> > index e229a3b..363d8b7 100644
> > --- a/net/ipv6/route.c
> > +++ b/net/ipv6/route.c
> > @@ -928,7 +928,7 @@ restart:
> >          dst_hold(&rt->dst);
> >          read_unlock_bh(&table->tb6_lock);
> >
> > -        if (!rt->n && !(rt->rt6i_flags & RTF_NONEXTHOP))
> > +        if (!rt->n && !(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_LOCAL)))
> >                  nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
> >          else if (!(rt->dst.flags & DST_HOST))
> >                  nrt = rt6_alloc_clone(rt, &fl6->daddr);
>
> I'm not sure this patch is doing the right thing. It seems to break
> IPv6 loopback functionality, it is no longer equivalent to IPv4, as
> stated above. It doesn't just stop neighbor creation but it stops
> cached route creation. Seems like a scary change for a stable tree.
> See below:
>
> $ ip -4 route show local
> local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
>
> This local route enables us to use the whole loopback network, any
> address inside 127.0.0.0/8 will work.
>
> $ ping -c1 127.0.0.9
> PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
> 64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms
>
> --- 127.0.0.9 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms
>
> This also used to work equivalently for IPv6 local loopback routes:
>
> $ ip -6 route add local 2001::/64 dev lo
> $ ping6 -c1 2001::9
> PING 2001::9(2001::9) 56 data bytes
> 64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms
>
> --- 2001::9 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
>
> However with this patch, this is very broken:
>
> $ ip -6 route add local 2001::/64 dev lo
> $ ping6 -c1 2001::9
> PING 2001::9(2001::9) 56 data bytes
> ping: sendmsg: Invalid argument

I do think that the patch above is fine. I wonder why you get a blackhole
route back here. Maybe backtracking in ip6_pol_route or in fib6_lookup_1
was way too aggressive?
On Thu, Aug 08, 2013 at 09:47:02PM +0200, Hannes Frederic Sowa wrote:
> On Thu, Aug 08, 2013 at 02:45:40PM -0400, Debabrata Banerjee wrote:
> > On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> > > From: Marcelo Ricardo Leitner <mleitner@redhat.com>
> > >
> > > They will be created at output, if ever needed. This avoids creating
> > > empty neighbor entries when TPROXYing/Forwarding packets for addresses
> > > that are not even directly reachable.
> > >
> > > Note that IPv4 already handles it this way. No neighbor entries are
> > > created for local input.
> > >
> > > Tested by myself and customer.
> > >
> > > Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> > > Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
> > > ---
> > >  net/ipv6/route.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> > > index e229a3b..363d8b7 100644
> > > --- a/net/ipv6/route.c
> > > +++ b/net/ipv6/route.c
> > > @@ -928,7 +928,7 @@ restart:
> > >          dst_hold(&rt->dst);
> > >          read_unlock_bh(&table->tb6_lock);
> > >
> > > -        if (!rt->n && !(rt->rt6i_flags & RTF_NONEXTHOP))
> > > +        if (!rt->n && !(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_LOCAL)))
> > >                  nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
> > >          else if (!(rt->dst.flags & DST_HOST))
> > >                  nrt = rt6_alloc_clone(rt, &fl6->daddr);
> >
> > I'm not sure this patch is doing the right thing. It seems to break
> > IPv6 loopback functionality, it is no longer equivalent to IPv4, as
> > stated above. It doesn't just stop neighbor creation but it stops
> > cached route creation. Seems like a scary change for a stable tree.
> > See below:
> >
> > $ ip -4 route show local
> > local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
> >
> > This local route enables us to use the whole loopback network, any
> > address inside 127.0.0.0/8 will work.
> >
> > $ ping -c1 127.0.0.9
> > PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
> > 64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms
> >
> > --- 127.0.0.9 ping statistics ---
> > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms
> >
> > This also used to work equivalently for IPv6 local loopback routes:
> >
> > $ ip -6 route add local 2001::/64 dev lo
> > $ ping6 -c1 2001::9
> > PING 2001::9(2001::9) 56 data bytes
> > 64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms
> >
> > --- 2001::9 ping statistics ---
> > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
> >
> > However with this patch, this is very broken:
> >
> > $ ip -6 route add local 2001::/64 dev lo
> > $ ping6 -c1 2001::9
> > PING 2001::9(2001::9) 56 data bytes
> > ping: sendmsg: Invalid argument
>
> I do think that the patch above is fine. I wonder why you get a blackhole
> route back here. Maybe backtracking in ip6_pol_route or in fib6_lookup_1
> was way too aggressive?

Ah sorry, before the rt->n removal everything worked a bit
differently. rt6_alloc_cow did fill rt->n back then. To fix both things
we would have to bind a neighbour towards the loopback interface into
the non-cloned rt6_info if it feeds packets towards lo. Pretty big change
for old stable kernels, I guess. :/

Marcelo, any idea how to deal with this? My guess would be a revert, but I
don't know the impact on the tproxy issue.

Greetings,

  Hannes
On 08-08-2013 17:16, Hannes Frederic Sowa wrote:
> On Thu, Aug 08, 2013 at 09:47:02PM +0200, Hannes Frederic Sowa wrote:
>> On Thu, Aug 08, 2013 at 02:45:40PM -0400, Debabrata Banerjee wrote:
>>> On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>>>
>>>> They will be created at output, if ever needed. This avoids creating
>>>> empty neighbor entries when TPROXYing/Forwarding packets for addresses
>>>> that are not even directly reachable.
>>>>
>>>> Note that IPv4 already handles it this way. No neighbor entries are
>>>> created for local input.
>>>>
>>>> Tested by myself and customer.
>>>>
>>>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>>>> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>>> ---
>>>>  net/ipv6/route.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>>>> index e229a3b..363d8b7 100644
>>>> --- a/net/ipv6/route.c
>>>> +++ b/net/ipv6/route.c
>>>> @@ -928,7 +928,7 @@ restart:
>>>>          dst_hold(&rt->dst);
>>>>          read_unlock_bh(&table->tb6_lock);
>>>>
>>>> -        if (!rt->n && !(rt->rt6i_flags & RTF_NONEXTHOP))
>>>> +        if (!rt->n && !(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_LOCAL)))
>>>>                  nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
>>>>          else if (!(rt->dst.flags & DST_HOST))
>>>>                  nrt = rt6_alloc_clone(rt, &fl6->daddr);
>>>
>>> I'm not sure this patch is doing the right thing. It seems to break
>>> IPv6 loopback functionality, it is no longer equivalent to IPv4, as
>>> stated above. It doesn't just stop neighbor creation but it stops
>>> cached route creation. Seems like a scary change for a stable tree.
>>> See below:
>>>
>>> $ ip -4 route show local
>>> local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
>>>
>>> This local route enables us to use the whole loopback network, any
>>> address inside 127.0.0.0/8 will work.
>>>
>>> $ ping -c1 127.0.0.9
>>> PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
>>> 64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms
>>>
>>> --- 127.0.0.9 ping statistics ---
>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>> rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms
>>>
>>> This also used to work equivalently for IPv6 local loopback routes:
>>>
>>> $ ip -6 route add local 2001::/64 dev lo
>>> $ ping6 -c1 2001::9
>>> PING 2001::9(2001::9) 56 data bytes
>>> 64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms
>>>
>>> --- 2001::9 ping statistics ---
>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>> rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
>>>
>>> However with this patch, this is very broken:
>>>
>>> $ ip -6 route add local 2001::/64 dev lo
>>> $ ping6 -c1 2001::9
>>> PING 2001::9(2001::9) 56 data bytes
>>> ping: sendmsg: Invalid argument
>>
>> I do think that the patch above is fine. I wonder why you get a blackhole
>> route back here. Maybe backtracking in ip6_pol_route or in fib6_lookup_1
>> was way too aggressive?
>
> Ah sorry, before the rt->n removal everything worked a bit
> differently. rt6_alloc_cow did fill rt->n back then. To fix both things
> we would have to bind a neighbour towards the loopback interface into
> the non-cloned rt6_info if it feeds packets towards lo. Pretty big change
> for old stable kernels, I guess. :/
>
> Marcelo, any idea how to deal with this? My guess would be a revert, but I
> don't know the impact on the tproxy issue.

Good question :) Nothing so far, sorry.

The impact would be a return to the previous state, in which a tproxy
server is limited by the neighbor cache size. And just making the cache
larger is not a good option, as it will introduce big latency spikes
during cleanup.

I'll have to rebuild the tproxy environment I had in order to test this
again, which will take a while. I'll keep you posted.

Cheers,
Marcelo
On 08-08-2013 17:45, Marcelo Ricardo Leitner wrote:
> On 08-08-2013 17:16, Hannes Frederic Sowa wrote:
>> On Thu, Aug 08, 2013 at 09:47:02PM +0200, Hannes Frederic Sowa wrote:
>>> On Thu, Aug 08, 2013 at 02:45:40PM -0400, Debabrata Banerjee wrote:
>>>> On Wed, Jan 30, 2013 at 3:26 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>>> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>>>>
>>>>> They will be created at output, if ever needed. This avoids creating
>>>>> empty neighbor entries when TPROXYing/Forwarding packets for addresses
>>>>> that are not even directly reachable.
>>>>>
>>>>> Note that IPv4 already handles it this way. No neighbor entries are
>>>>> created for local input.
>>>>>
>>>>> Tested by myself and customer.
>>>>>
>>>>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>>>>> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>>>> ---
>>>>>  net/ipv6/route.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>>>>> index e229a3b..363d8b7 100644
>>>>> --- a/net/ipv6/route.c
>>>>> +++ b/net/ipv6/route.c
>>>>> @@ -928,7 +928,7 @@ restart:
>>>>>  	dst_hold(&rt->dst);
>>>>>  	read_unlock_bh(&table->tb6_lock);
>>>>>
>>>>> -	if (!rt->n && !(rt->rt6i_flags & RTF_NONEXTHOP))
>>>>> +	if (!rt->n && !(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_LOCAL)))
>>>>>  		nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
>>>>>  	else if (!(rt->dst.flags & DST_HOST))
>>>>>  		nrt = rt6_alloc_clone(rt, &fl6->daddr);
>>>>
>>>> I'm not sure this patch is doing the right thing. It seems to break
>>>> IPv6 loopback functionality, it is no longer equivalent to IPv4, as
>>>> stated above. It doesn't just stop neighbor creation but it stops
>>>> cached route creation. Seems like a scary change for a stable tree.
>>>> See below:
>>>>
>>>> $ ip -4 route show local
>>>> local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
>>>>
>>>> This local route enables us to use the whole loopback network, any
>>>> address inside 127.0.0.0/8 will work.
>>>>
>>>> $ ping -c1 127.0.0.9
>>>> PING 127.0.0.9 (127.0.0.9) 56(84) bytes of data.
>>>> 64 bytes from 127.0.0.9: icmp_seq=1 ttl=64 time=0.012 ms
>>>>
>>>> --- 127.0.0.9 ping statistics ---
>>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>>> rtt min/avg/max/mdev = 0.012/0.012/0.012/0.000 ms
>>>>
>>>> This also used to work equivalently for IPv6 local loopback routes:
>>>>
>>>> $ ip -6 route add local 2001::/64 dev lo
>>>> $ ping6 -c1 2001::9
>>>> PING 2001::9(2001::9) 56 data bytes
>>>> 64 bytes from 2001::9: icmp_seq=1 ttl=64 time=0.010 ms
>>>>
>>>> --- 2001::9 ping statistics ---
>>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>>> rtt min/avg/max/mdev = 0.010/0.010/0.010/0.000 ms
>>>>
>>>> However with this patch, this is very broken:
>>>>
>>>> $ ip -6 route add local 2001::/64 dev lo
>>>> $ ping6 -c1 2001::9
>>>> PING 2001::9(2001::9) 56 data bytes
>>>> ping: sendmsg: Invalid argument
>>>
>>> I do think that the patch above is fine. I wonder why you get a blackhole
>>> route back here. Maybe backtracking in ip6_pol_route or in fib6_lookup_1 was
>>> way too aggressive?
>>
>> Ah sorry, before rt->n removal everything worked a bit
>> different. rt6_alloc_cow did fill rt->n back then. To fix both things
>> we would have to bind a neighbour towards the loopback interface into
>> the non-cloned rt6_info if it feeds packets towards lo. Pretty big change for
>> old stable kernels, I guess. :/
>>
>> Marcelo, any idea how to deal with this? My guess would be a revert, but I
>> don't know the impact on the tproxy issue.
>
> Good question :) Nothing so far, sorry.
>
> The impact would be returning to the previous state, that a tproxy server is
> limited to neighbor cache size. And just making it larger is not a good option
> as it will introduce big latency spikes during cleanup.
>
> I'll have to rebuild the tproxy environment I had to test this out again, it
> will take a while. Keep you posted.

Aye, and thanks for assisting on this, Hannes, appreciated.

Cheers,
Marcelo
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index e229a3b..363d8b7 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -928,7 +928,7 @@ restart:
 	dst_hold(&rt->dst);
 	read_unlock_bh(&table->tb6_lock);
 
-	if (!rt->n && !(rt->rt6i_flags & RTF_NONEXTHOP))
+	if (!rt->n && !(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_LOCAL)))
 		nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
 	else if (!(rt->dst.flags & DST_HOST))
 		nrt = rt6_alloc_clone(rt, &fl6->daddr);
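
For readers following the thread, the branch selection in the hunk above can be modelled in user space. This is only a sketch under stated assumptions, not kernel code: the `model_*` names, the `struct model_route` layout and the flag bit values are all invented for illustration, and only the shape of the two conditions is taken from the diff. What the kernel does when neither condition matches lies outside the hunk, so the model just reports that case as `MODEL_NEITHER`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; NOT copied from kernel headers. */
#define MODEL_RTF_NONEXTHOP (1u << 0)
#define MODEL_RTF_LOCAL     (1u << 1)
#define MODEL_DST_HOST      (1u << 0)

/* Which branch of the if/else-if in the hunk would be taken. */
enum model_action {
	MODEL_COW,     /* first branch: rt6_alloc_cow() in the diff */
	MODEL_CLONE,   /* second branch: rt6_alloc_clone() in the diff */
	MODEL_NEITHER  /* neither condition matched; not shown in the hunk */
};

/* Hypothetical stand-in for the fields the hunk inspects. */
struct model_route {
	bool has_neigh;     /* models rt->n != NULL */
	uint32_t rt_flags;  /* models rt->rt6i_flags */
	uint32_t dst_flags; /* models rt->dst.flags */
};

/* skip_cow_mask is the set of route flags that bypass the first branch. */
static enum model_action model_decide(const struct model_route *rt,
				      uint32_t skip_cow_mask)
{
	if (!rt->has_neigh && !(rt->rt_flags & skip_cow_mask))
		return MODEL_COW;
	else if (!(rt->dst_flags & MODEL_DST_HOST))
		return MODEL_CLONE;
	return MODEL_NEITHER;
}

/* Condition as on the '-' line: only NONEXTHOP bypasses the first branch. */
static enum model_action model_before_patch(const struct model_route *rt)
{
	return model_decide(rt, MODEL_RTF_NONEXTHOP);
}

/* Condition as on the '+' line: NONEXTHOP or LOCAL bypasses it. */
static enum model_action model_after_patch(const struct model_route *rt)
{
	return model_decide(rt, MODEL_RTF_NONEXTHOP | MODEL_RTF_LOCAL);
}
```

In this model, a local prefix route without a bound neighbour and without the host flag (roughly the `local 2001::/64 dev lo` case from the report, assuming that is how such a route is flagged) selects the first branch before the patch and the second branch after it. A non-local route with no neighbour selects the first branch in both versions.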