diff mbox series

[net] ipv6: fixes rt6_probe() and fib6_nh->last_probe init

Message ID 20191107024509.87121-1-edumazet@google.com
State Superseded
Delegated to: David Miller
Headers show
Series [net] ipv6: fixes rt6_probe() and fib6_nh->last_probe init | expand

Commit Message

Eric Dumazet Nov. 7, 2019, 2:45 a.m. UTC
While looking at a syzbot KCSAN report [1], I found multiple
issues in this code :

1) fib6_nh->last_probe has an initial value of 0.

   While probably okay on 64bit kernels, this causes an issue
   on 32bit kernels since the time_after(jiffies, 0 + interval)
   might be false ~24 days after boot (for HZ=1000)

2) The data-race found by KCSAN
   I could use READ_ONCE() and WRITE_ONCE(), but we also can
   take the opportunity of not piling-up too many rt6_probe_deferred()
   works by using instead cmpxchg() so that only one cpu wins the race.

[1]
BUG: KCSAN: data-race in find_match / find_match

write to 0xffff8880bb7aabe8 of 8 bytes by interrupt on cpu 1:
 rt6_probe net/ipv6/route.c:663 [inline]
 find_match net/ipv6/route.c:757 [inline]
 find_match+0x5bd/0x790 net/ipv6/route.c:733
 __find_rr_leaf+0xe3/0x780 net/ipv6/route.c:831
 find_rr_leaf net/ipv6/route.c:852 [inline]
 rt6_select net/ipv6/route.c:896 [inline]
 fib6_table_lookup+0x383/0x650 net/ipv6/route.c:2164
 ip6_pol_route+0xee/0x5c0 net/ipv6/route.c:2200
 ip6_pol_route_output+0x48/0x60 net/ipv6/route.c:2452
 fib6_rule_lookup+0x3d6/0x470 net/ipv6/fib6_rules.c:117
 ip6_route_output_flags_noref+0x16b/0x230 net/ipv6/route.c:2484
 ip6_route_output_flags+0x50/0x1a0 net/ipv6/route.c:2497
 ip6_dst_lookup_tail+0x25d/0xc30 net/ipv6/ip6_output.c:1049
 ip6_dst_lookup_flow+0x68/0x120 net/ipv6/ip6_output.c:1150
 inet6_csk_route_socket+0x2f7/0x420 net/ipv6/inet6_connection_sock.c:106
 inet6_csk_xmit+0x91/0x1f0 net/ipv6/inet6_connection_sock.c:121
 __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
 tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
 tcp_xmit_probe_skb+0x19b/0x1d0 net/ipv4/tcp_output.c:3735

read to 0xffff8880bb7aabe8 of 8 bytes by interrupt on cpu 0:
 rt6_probe net/ipv6/route.c:657 [inline]
 find_match net/ipv6/route.c:757 [inline]
 find_match+0x521/0x790 net/ipv6/route.c:733
 __find_rr_leaf+0xe3/0x780 net/ipv6/route.c:831
 find_rr_leaf net/ipv6/route.c:852 [inline]
 rt6_select net/ipv6/route.c:896 [inline]
 fib6_table_lookup+0x383/0x650 net/ipv6/route.c:2164
 ip6_pol_route+0xee/0x5c0 net/ipv6/route.c:2200
 ip6_pol_route_output+0x48/0x60 net/ipv6/route.c:2452
 fib6_rule_lookup+0x3d6/0x470 net/ipv6/fib6_rules.c:117
 ip6_route_output_flags_noref+0x16b/0x230 net/ipv6/route.c:2484
 ip6_route_output_flags+0x50/0x1a0 net/ipv6/route.c:2497
 ip6_dst_lookup_tail+0x25d/0xc30 net/ipv6/ip6_output.c:1049
 ip6_dst_lookup_flow+0x68/0x120 net/ipv6/ip6_output.c:1150
 inet6_csk_route_socket+0x2f7/0x420 net/ipv6/inet6_connection_sock.c:106
 inet6_csk_xmit+0x91/0x1f0 net/ipv6/inet6_connection_sock.c:121
 __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 18894 Comm: udevd Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: cc3a86c802f0 ("ipv6: Change rt6_probe to take a fib6_nh")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: David Ahern <dsahern@gmail.com>
---
 net/ipv6/route.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

David Ahern Nov. 7, 2019, 4:37 p.m. UTC | #1
On 11/6/19 7:45 PM, Eric Dumazet wrote:
> While looking at a syzbot KCSAN report [1], I found multiple
> issues in this code :
> 
> 1) fib6_nh->last_probe has an initial value of 0.
> 
>    While probably okay on 64bit kernels, this causes an issue
>    on 32bit kernels since the time_after(jiffies, 0 + interval)
>    might be false ~24 days after boot (for HZ=1000)
> 
> 2) The data-race found by KCSAN
>    I could use READ_ONCE() and WRITE_ONCE(), but we also can
>    take the opportunity of not piling-up too many rt6_probe_deferred()
>    works by using instead cmpxchg() so that only one cpu wins the race.
> 

...

> Fixes: cc3a86c802f0 ("ipv6: Change rt6_probe to take a fib6_nh")

That commit only moves the location of last_probe, from fib6_info into
fib6_nh. Given that I would expect the same problem to exist with the
previous code. Agree? Point being should this be backported to older
stable releases since said commit is new to 5.2?
Eric Dumazet Nov. 7, 2019, 4:43 p.m. UTC | #2
On Thu, Nov 7, 2019 at 8:37 AM David Ahern <dsahern@gmail.com> wrote:
>
> On 11/6/19 7:45 PM, Eric Dumazet wrote:
> > While looking at a syzbot KCSAN report [1], I found multiple
> > issues in this code :
> >
> > 1) fib6_nh->last_probe has an initial value of 0.
> >
> >    While probably okay on 64bit kernels, this causes an issue
> >    on 32bit kernels since the time_after(jiffies, 0 + interval)
> >    might be false ~24 days after boot (for HZ=1000)
> >
> > 2) The data-race found by KCSAN
> >    I could use READ_ONCE() and WRITE_ONCE(), but we also can
> >    take the opportunity of not piling-up too many rt6_probe_deferred()
> >    works by using instead cmpxchg() so that only one cpu wins the race.
> >
>
> ...
>
> > Fixes: cc3a86c802f0 ("ipv6: Change rt6_probe to take a fib6_nh")
>
> That commit only moves the location of last_probe, from fib6_info into
> fib6_nh. Given that I would expect the same problem to exist with the
> previous code. Agree? Point being should this be backported to older
> stable releases since said commit is new to 5.2?
Eric Dumazet Nov. 7, 2019, 4:45 p.m. UTC | #3
On Thu, Nov 7, 2019 at 8:37 AM David Ahern <dsahern@gmail.com> wrote:
>
> On 11/6/19 7:45 PM, Eric Dumazet wrote:
> > While looking at a syzbot KCSAN report [1], I found multiple
> > issues in this code :
> >
> > 1) fib6_nh->last_probe has an initial value of 0.
> >
> >    While probably okay on 64bit kernels, this causes an issue
> >    on 32bit kernels since the time_after(jiffies, 0 + interval)
> >    might be false ~24 days after boot (for HZ=1000)
> >
> > 2) The data-race found by KCSAN
> >    I could use READ_ONCE() and WRITE_ONCE(), but we also can
> >    take the opportunity of not piling-up too many rt6_probe_deferred()
> >    works by using instead cmpxchg() so that only one cpu wins the race.
> >
>
> ...
>
> > Fixes: cc3a86c802f0 ("ipv6: Change rt6_probe to take a fib6_nh")
>
> That commit only moves the location of last_probe, from fib6_info into
> fib6_nh. Given that I would expect the same problem to exist with the
> previous code. Agree? Point being should this be backported to older
> stable releases since said commit is new to 5.2?

Yes, the commit adding last probe went in 4.19

Fixes: f547fac624be ("ipv6: rate-limit probes for neighbourless routes")

Thanks.
David Ahern Nov. 7, 2019, 4:50 p.m. UTC | #4
On 11/7/19 9:45 AM, Eric Dumazet wrote:
> On Thu, Nov 7, 2019 at 8:37 AM David Ahern <dsahern@gmail.com> wrote:
>>
>> On 11/6/19 7:45 PM, Eric Dumazet wrote:
>>> While looking at a syzbot KCSAN report [1], I found multiple
>>> issues in this code :
>>>
>>> 1) fib6_nh->last_probe has an initial value of 0.
>>>
>>>    While probably okay on 64bit kernels, this causes an issue
>>>    on 32bit kernels since the time_after(jiffies, 0 + interval)
>>>    might be false ~24 days after boot (for HZ=1000)
>>>
>>> 2) The data-race found by KCSAN
>>>    I could use READ_ONCE() and WRITE_ONCE(), but we also can
>>>    take the opportunity of not piling-up too many rt6_probe_deferred()
>>>    works by using instead cmpxchg() so that only one cpu wins the race.
>>>
>>
>> ...
>>
>>> Fixes: cc3a86c802f0 ("ipv6: Change rt6_probe to take a fib6_nh")
>>
>> That commit only moves the location of last_probe, from fib6_info into
>> fib6_nh. Given that I would expect the same problem to exist with the
>> previous code. Agree? Point being should this be backported to older
>> stable releases since said commit is new to 5.2?
> 
> Yes, the commit adding last probe went in 4.19
> 
> Fixes: f547fac624be ("ipv6: rate-limit probes for neighbourless routes")
> 

Reviewed-by: David Ahern <dsahern@gmail.com>
Eric Dumazet Nov. 7, 2019, 5:26 p.m. UTC | #5
On Thu, Nov 7, 2019 at 8:50 AM David Ahern <dsahern@gmail.com> wrote:
>

> Reviewed-by: David Ahern <dsahern@gmail.com>

Great, I 'll send a V2 for David Miller convenience (I do not believe
additional Fixes: tag are taken by patchwork)
diff mbox series

Patch

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index a63ff85fe14198fc23a5cbc7abcd107df5df00c8..e60bf8e7dd1aa8dafe8981ecef633e16e1a52d8d 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -621,6 +621,7 @@  static void rt6_probe(struct fib6_nh *fib6_nh)
 {
 	struct __rt6_probe_work *work = NULL;
 	const struct in6_addr *nh_gw;
+	unsigned long last_probe;
 	struct neighbour *neigh;
 	struct net_device *dev;
 	struct inet6_dev *idev;
@@ -639,6 +640,7 @@  static void rt6_probe(struct fib6_nh *fib6_nh)
 	nh_gw = &fib6_nh->fib_nh_gw6;
 	dev = fib6_nh->fib_nh_dev;
 	rcu_read_lock_bh();
+	last_probe = READ_ONCE(fib6_nh->last_probe);
 	idev = __in6_dev_get(dev);
 	neigh = __ipv6_neigh_lookup_noref(dev, nh_gw);
 	if (neigh) {
@@ -654,13 +656,15 @@  static void rt6_probe(struct fib6_nh *fib6_nh)
 				__neigh_set_probe_once(neigh);
 		}
 		write_unlock(&neigh->lock);
-	} else if (time_after(jiffies, fib6_nh->last_probe +
+	} else if (time_after(jiffies, last_probe +
 				       idev->cnf.rtr_probe_interval)) {
 		work = kmalloc(sizeof(*work), GFP_ATOMIC);
 	}
 
-	if (work) {
-		fib6_nh->last_probe = jiffies;
+	if (!work || cmpxchg(&fib6_nh->last_probe,
+			     last_probe, jiffies) != last_probe) {
+		kfree(work);
+	} else {
 		INIT_WORK(&work->work, rt6_probe_deferred);
 		work->target = *nh_gw;
 		dev_hold(dev);
@@ -3383,6 +3387,9 @@  int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
 	int err;
 
 	fib6_nh->fib_nh_family = AF_INET6;
+#ifdef CONFIG_IPV6_ROUTER_PREF
+	fib6_nh->last_probe = jiffies;
+#endif
 
 	err = -ENODEV;
 	if (cfg->fc_ifindex) {