diff mbox series

[net] ipv6: fix neighbour resolution with raw socket

Message ID 20190620123434.7219-1-nicolas.dichtel@6wind.com
State Changes Requested
Delegated to: David Miller
Headers show
Series [net] ipv6: fix neighbour resolution with raw socket | expand

Commit Message

Nicolas Dichtel June 20, 2019, 12:34 p.m. UTC
The scenario is the following: the user uses a raw socket to send an ipv6
packet, destinated to a not-connected network, and specify a connected nh.
Here is the corresponding python script to reproduce this scenario:

 import socket
 IPPROTO_RAW = 255
 send_s = socket.socket(socket.AF_INET6, socket.SOCK_RAW, IPPROTO_RAW)
 # scapy
 # p = IPv6(src='fd00:100::1', dst='fd00:200::fa')/ICMPv6EchoRequest()
 # str(p)
 req = b'`\x00\x00\x00\x00\x08:@\xfd\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xfd\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfa\x80\x00\x81\xc0\x00\x00\x00\x00'
 send_s.sendto(req, ('fd00:175::2', 0, 0, 0))

fd00:175::/64 is a connected route and fd00:200::fa is not a connected
host.

With this scenario, the kernel starts by sending a NS to resolve
fd00:175::2. When it receives the NA, it flushes its queue and try to send
the initial packet. But instead of sending it, it sends another NS to
resolve fd00:200::fa, which obvioulsy fails, thus the packet is dropped. If
the user sends again the packet, it now uses the right nh (fd00:175::2).

The problem is that ip6_dst_lookup_neigh() uses the rt6i_gateway, which is
:: because the associated route is a connected route, thus it uses the dst
addr of the packet. Let's use rt6_nexthop() to choose the right nh.

Note that rt and in6addr_any are const in ip6_dst_lookup_neigh(), thus
let's constify rt6_nexthop() to avoid ugly cast.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 include/net/ip6_route.h | 4 ++--
 net/ipv6/ip6_output.c   | 2 +-
 net/ipv6/route.c        | 3 ++-
 3 files changed, 5 insertions(+), 4 deletions(-)

Comments

David Ahern June 20, 2019, 3:12 p.m. UTC | #1
On 6/20/19 6:34 AM, Nicolas Dichtel wrote:
> The scenario is the following: the user uses a raw socket to send an ipv6
> packet, destinated to a not-connected network, and specify a connected nh.
> Here is the corresponding python script to reproduce this scenario:
> 
>  import socket
>  IPPROTO_RAW = 255
>  send_s = socket.socket(socket.AF_INET6, socket.SOCK_RAW, IPPROTO_RAW)
>  # scapy
>  # p = IPv6(src='fd00:100::1', dst='fd00:200::fa')/ICMPv6EchoRequest()
>  # str(p)
>  req = b'`\x00\x00\x00\x00\x08:@\xfd\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xfd\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfa\x80\x00\x81\xc0\x00\x00\x00\x00'
>  send_s.sendto(req, ('fd00:175::2', 0, 0, 0))
> 
> fd00:175::/64 is a connected route and fd00:200::fa is not a connected
> host.
> 
> With this scenario, the kernel starts by sending a NS to resolve
> fd00:175::2. When it receives the NA, it flushes its queue and try to send
> the initial packet. But instead of sending it, it sends another NS to
> resolve fd00:200::fa, which obvioulsy fails, thus the packet is dropped. If
> the user sends again the packet, it now uses the right nh (fd00:175::2).
> 

what's the local address and route setup? You reference fd00:100::1 and
fd00:200::fa with connected route fd00:175::/64.
Nicolas Dichtel June 20, 2019, 3:42 p.m. UTC | #2
Le 20/06/2019 à 17:12, David Ahern a écrit :
> On 6/20/19 6:34 AM, Nicolas Dichtel wrote:
>> The scenario is the following: the user uses a raw socket to send an ipv6
>> packet, destinated to a not-connected network, and specify a connected nh.
>> Here is the corresponding python script to reproduce this scenario:
>>
>>  import socket
>>  IPPROTO_RAW = 255
>>  send_s = socket.socket(socket.AF_INET6, socket.SOCK_RAW, IPPROTO_RAW)
>>  # scapy
>>  # p = IPv6(src='fd00:100::1', dst='fd00:200::fa')/ICMPv6EchoRequest()
>>  # str(p)
>>  req = b'`\x00\x00\x00\x00\x08:@\xfd\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xfd\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfa\x80\x00\x81\xc0\x00\x00\x00\x00'
>>  send_s.sendto(req, ('fd00:175::2', 0, 0, 0))
>>
>> fd00:175::/64 is a connected route and fd00:200::fa is not a connected
>> host.
>>
>> With this scenario, the kernel starts by sending a NS to resolve
>> fd00:175::2. When it receives the NA, it flushes its queue and try to send
>> the initial packet. But instead of sending it, it sends another NS to
>> resolve fd00:200::fa, which obvioulsy fails, thus the packet is dropped. If
>> the user sends again the packet, it now uses the right nh (fd00:175::2).
>>
> 
> what's the local address and route setup? You reference fd00:100::1 and
> fd00:200::fa with connected route fd00:175::/64.
> 

The test in done on the dut:

    +-----+             +------+             +------+             +-----+
    | tnl |             | dut  |.1         .2|router|             | tnr |
    |     |             |     2+-------------+2     |             |     |
    |     |.1         .2|      |fd00:125::/64|      |.2         .1|     |
    |    1+-------------+1     |             |     1+-------------+1    |
    |     |fd00:100::/64|      |             |      |fd00:200::/64|     |
    |     |             |      |.1         .2|      |             |     |
    |     |             |     3+-------------+3     |             |     |
    |     |             |      |fd00:175::/64|      |             |     |
    +-----+             +------+             +------+             +-----+

On dut:
ip address add fd00:100::2/64 dev ntfp1
ip address add fd00:125::1/64 dev ntfp2
ip address add fd00:175::1/64 dev ntfp3
ip route add fd00:200::/64 via fd00:125::2
ip route add fd00:200::/120 nexthop via fd00:125::2 nexthop via fd00:175::2

Note that fd00:200::fa is not reachable but we expect to see the packet on the
host 'router'.
David Ahern June 20, 2019, 4:36 p.m. UTC | #3
On 6/20/19 9:42 AM, Nicolas Dichtel wrote:
> Le 20/06/2019 à 17:12, David Ahern a écrit :
>> On 6/20/19 6:34 AM, Nicolas Dichtel wrote:
>>> The scenario is the following: the user uses a raw socket to send an ipv6
>>> packet, destinated to a not-connected network, and specify a connected nh.
>>> Here is the corresponding python script to reproduce this scenario:
>>>
>>>  import socket
>>>  IPPROTO_RAW = 255
>>>  send_s = socket.socket(socket.AF_INET6, socket.SOCK_RAW, IPPROTO_RAW)
>>>  # scapy
>>>  # p = IPv6(src='fd00:100::1', dst='fd00:200::fa')/ICMPv6EchoRequest()
>>>  # str(p)
>>>  req = b'`\x00\x00\x00\x00\x08:@\xfd\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xfd\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfa\x80\x00\x81\xc0\x00\x00\x00\x00'
>>>  send_s.sendto(req, ('fd00:175::2', 0, 0, 0))
>>>
>>> fd00:175::/64 is a connected route and fd00:200::fa is not a connected
>>> host.
>>>
>>> With this scenario, the kernel starts by sending a NS to resolve
>>> fd00:175::2. When it receives the NA, it flushes its queue and try to send
>>> the initial packet. But instead of sending it, it sends another NS to
>>> resolve fd00:200::fa, which obvioulsy fails, thus the packet is dropped. If
>>> the user sends again the packet, it now uses the right nh (fd00:175::2).
>>>
>>
>> what's the local address and route setup? You reference fd00:100::1 and
>> fd00:200::fa with connected route fd00:175::/64.
>>
> 
> The test in done on the dut:
> 
>     +-----+             +------+             +------+             +-----+
>     | tnl |             | dut  |.1         .2|router|             | tnr |
>     |     |             |     2+-------------+2     |             |     |
>     |     |.1         .2|      |fd00:125::/64|      |.2         .1|     |
>     |    1+-------------+1     |             |     1+-------------+1    |
>     |     |fd00:100::/64|      |             |      |fd00:200::/64|     |
>     |     |             |      |.1         .2|      |             |     |
>     |     |             |     3+-------------+3     |             |     |
>     |     |             |      |fd00:175::/64|      |             |     |
>     +-----+             +------+             +------+             +-----+
> 
> On dut:
> ip address add fd00:100::2/64 dev ntfp1
> ip address add fd00:125::1/64 dev ntfp2
> ip address add fd00:175::1/64 dev ntfp3
> ip route add fd00:200::/64 via fd00:125::2
> ip route add fd00:200::/120 nexthop via fd00:125::2 nexthop via fd00:175::2
> 
> Note that fd00:200::fa is not reachable but we expect to see the packet on the
> host 'router'.
> 

gotcha. Thanks for the diagram.

Reviewed-by: David Ahern <dsahern@gmail.com>

You don't have a fixes tag, but this should go to stable releases.

Also, this does not fix the forwarding case. For the forwarding case I
still see it trying to resolve fd00:200::fa from dut.

Namespace version of the above setup:

create_ns() {
        local ns=$1
        ip netns add ${ns}
        ip -netns ${ns} link set lo up
        ip netns exec ${ns} sysctl -qw net.ipv4.ip_forward=1
        ip netns exec ${ns} sysctl -qw net.ipv6.conf.all.keep_addr_on_down=1
        ip netns exec ${ns} sysctl -qw net.ipv6.conf.all.forwarding=1
        ip netns exec ${ns} sysctl -qw net.ipv6.conf.default.forwarding=1
}

create_ns tnl
create_ns dut
create_ns router
create_ns tnr

ip -netns tnl li add eth1 type veth peer name eth1d
ip -netns tnl li set eth1d netns dut name eth1
ip -netns tnl li set eth1 up
ip -netns tnl addr add fd00:100::1/64 dev eth1
ip -netns tnl -6 ro add default via fd00:100::2

ip -netns dut li set eth1 up
ip -netns dut addr add fd00:100::2/64 dev eth1

ip -netns dut li add eth2 type veth peer name eth2r
ip -netns dut li set eth2r netns router name eth2
ip -netns dut li set eth2 up
ip -netns dut addr add dev eth2 fd00:125::1/64
ip -netns router  li set eth2 up
ip -netns router addr add dev eth2 fd00:125::2/64

ip -netns dut li add eth3 type veth peer name eth3r
ip -netns dut li set eth3r netns router name eth3
ip -netns dut li set eth3 up
ip -netns dut addr add dev eth3 fd00:175::1/64
ip -netns router li set eth3 up
ip -netns router addr add dev eth3 fd00:175::2/64

ip -netns router li add eth1 type veth peer name eth1t
ip -netns router li set eth1t netns tnr name eth1
ip -netns router li set eth1 up
ip -netns router addr add dev eth1 fd00:200::2/64
ip -netns tnr li set eth1 up
ip -netns tnr addr add dev eth1 fd00:200::1/64

ip -netns dut route add fd00:200::/64 via fd00:125::2
ip -netns dut route add fd00:200::/120 nexthop via fd00:125::2 nexthop
via fd00:175::2
David Ahern June 20, 2019, 4:47 p.m. UTC | #4
On 6/20/19 10:36 AM, David Ahern wrote:
> 
> Also, this does not fix the forwarding case. For the forwarding case I
> still see it trying to resolve fd00:200::fa from dut.

nevermind. I used the wrong address for the forwarding case.
Nicolas Dichtel June 21, 2019, 8:09 a.m. UTC | #5
Le 20/06/2019 à 18:36, David Ahern a écrit :
[snip]
> You don't have a fixes tag, but this should go to stable releases.
Yeah, I was not able to point a specific commit. The bug is reproducible with a
4.4 from ubuntu-16.04, with a 3.10 from redhat-7 but not with a vanilla 3.10.20.
David Miller June 23, 2019, 12:07 a.m. UTC | #6
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Thu, 20 Jun 2019 14:34:34 +0200

> The scenario is the following: the user uses a raw socket to send an ipv6
> packet, destinated to a not-connected network, and specify a connected nh.
> Here is the corresponding python script to reproduce this scenario:
 ...
> fd00:175::/64 is a connected route and fd00:200::fa is not a connected
> host.
> 
> With this scenario, the kernel starts by sending a NS to resolve
> fd00:175::2. When it receives the NA, it flushes its queue and try to send
> the initial packet. But instead of sending it, it sends another NS to
> resolve fd00:200::fa, which obvioulsy fails, thus the packet is dropped. If
> the user sends again the packet, it now uses the right nh (fd00:175::2).
> 
> The problem is that ip6_dst_lookup_neigh() uses the rt6i_gateway, which is
> :: because the associated route is a connected route, thus it uses the dst
> addr of the packet. Let's use rt6_nexthop() to choose the right nh.
> 
> Note that rt and in6addr_any are const in ip6_dst_lookup_neigh(), thus
> let's constify rt6_nexthop() to avoid ugly cast.
> 
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

Applied and queued up for -stable, thanks.
David Miller June 23, 2019, 12:08 a.m. UTC | #7
From: David Miller <davem@davemloft.net>
Date: Sat, 22 Jun 2019 17:07:12 -0700 (PDT)

> Applied and queued up for -stable, thanks.

Actually, this needs a warning fix in bluetooth and netfilter.  Please
fix this up, do a proper allmodconfig build, and resubmit.

net/bluetooth/6lowpan.c: In function ‘peer_lookup_dst’:
net/bluetooth/6lowpan.c:188:11: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
net/netfilter/nf_flow_table_ip.c: In function ‘nf_flow_offload_ipv6_hook’:
net/netfilter/nf_flow_table_ip.c:481:10: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]

Thank you.
diff mbox series

Patch

diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index 4790beaa86e0..ee7405e759ba 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -262,8 +262,8 @@  static inline bool ip6_sk_ignore_df(const struct sock *sk)
 	       inet6_sk(sk)->pmtudisc == IPV6_PMTUDISC_OMIT;
 }
 
-static inline struct in6_addr *rt6_nexthop(struct rt6_info *rt,
-					   struct in6_addr *daddr)
+static inline const struct in6_addr *rt6_nexthop(const struct rt6_info *rt,
+						 const struct in6_addr *daddr)
 {
 	if (rt->rt6i_flags & RTF_GATEWAY)
 		return &rt->rt6i_gateway;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 834475717110..21efcd02f337 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -59,8 +59,8 @@  static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
 {
 	struct dst_entry *dst = skb_dst(skb);
 	struct net_device *dev = dst->dev;
+	const struct in6_addr *nexthop;
 	struct neighbour *neigh;
-	struct in6_addr *nexthop;
 	int ret;
 
 	if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr)) {
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 11ad62effd56..b6449bc03f11 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -218,7 +218,8 @@  static struct neighbour *ip6_dst_neigh_lookup(const struct dst_entry *dst,
 {
 	const struct rt6_info *rt = container_of(dst, struct rt6_info, dst);
 
-	return ip6_neigh_lookup(&rt->rt6i_gateway, dst->dev, skb, daddr);
+	return ip6_neigh_lookup(rt6_nexthop(rt, &in6addr_any),
+				dst->dev, skb, daddr);
 }
 
 static void ip6_confirm_neigh(const struct dst_entry *dst, const void *daddr)