diff mbox series

[v2] net: route: Fix vrf dst_entry ref count false increasing

Message ID 1a4c0c31-e74c-5167-0668-328dd342005e@huawei.com
State Rejected
Delegated to: David Miller
Headers show
Series [v2] net: route: Fix vrf dst_entry ref count false increasing | expand

Commit Message

Miaohe Lin May 4, 2019, 1:13 p.m. UTC
From: Suanming.Mou <mousuanming@huawei.com>

When config ip in default vrf same as the ip in specified
vrf, fib_lookup will return the route from table local
even if the in device is an enslaved l3mdev. Then the
dst_entry will hold the vrf device rather than loopback
device in local_input of function ip_route_input_slow.
So vrf dst_entry is false increased by route from table
local because device passed to rt_dst_alloc is in device
rather than fib result device.

Here is reproduce step:
1.enslave enp4s0 to vrf2, and config ip address:
ip link add vrf2 type vrf table 1
ip link set vrf2 up
ip link set enp4s0 master vrf2
ip addr ad 125.1.1.1/16 dev enp4s0

2.config same ip in default vrf:
ip addr ad 125.1.1.1/16 dev enp6s0

3.config peer and ping:
ip vrf exec vrf2 ping 125.1.1.2 -c 3

4.del vrf2 link:
ip link del vrf2

System hang with del vrf2 ops and "unregister_netdevice:
waiting for vrf2 to become free. Usage count = 1" occur.

Reported-by: Hui Wang <wanghui104@huawei.com>
Signed-off-by: Suanming.Mou <mousuanming@huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 net/ipv4/route.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

David Ahern May 4, 2019, 2:59 p.m. UTC | #1
On 5/4/19 7:13 AM, linmiaohe wrote:
> From: Suanming.Mou <mousuanming@huawei.com>
> 
> When config ip in default vrf same as the ip in specified
> vrf, fib_lookup will return the route from table local
> even if the in device is an enslaved l3mdev. Then the

you need to move the local rule with a preference of 0 after the l3mdev
rule.
Miaohe Lin May 5, 2019, 2:11 a.m. UTC | #2
On 2019/5/4 22:59, David Ahern wrote:
> On 5/4/19 7:13 AM, linmiaohe wrote:
>> From: Suanming.Mou <mousuanming@huawei.com>
>>
>> When config ip in default vrf same as the ip in specified
>> vrf, fib_lookup will return the route from table local
>> even if the in device is an enslaved l3mdev. Then the
> 
> you need to move the local rule with a preference of 0 after the l3mdev
> rule.
> 
> 

Move the local rule after l3mdev rule can get rid of this problem. And
even if this happend, we can delete the same ip address in default vrf
to fix it.
But I think maybe it's still a problem because other rule with default
vrf out device holds the specified vrf device. It looks unreasonable.

Many Thanks.
David Ahern May 5, 2019, 2:56 a.m. UTC | #3
On 5/4/19 8:11 PM, linmiaohe wrote:
> 
> 
> On 2019/5/4 22:59, David Ahern wrote:
>> On 5/4/19 7:13 AM, linmiaohe wrote:
>>> From: Suanming.Mou <mousuanming@huawei.com>
>>>
>>> When config ip in default vrf same as the ip in specified
>>> vrf, fib_lookup will return the route from table local
>>> even if the in device is an enslaved l3mdev. Then the
>>
>> you need to move the local rule with a preference of 0 after the l3mdev
>> rule.
>>
>>
> 
> Move the local rule after l3mdev rule can get rid of this problem. And
> even if this happend, we can delete the same ip address in default vrf
> to fix it.
> But I think maybe it's still a problem because other rule with default
> vrf out device holds the specified vrf device. It looks unreasonable.
> 
> Many Thanks.
> 

VRF is implemented using policy routing. If you do not move the local
rule below the l3mdev rule, you are doing a lookup in the local table
first, then vrf table, then main table. Doing the local table first can
result in false hits - like the case of duplicate IP addresses in
default VRF and a VRF. In short, it is just wrong.

Looking at the VRF documentation in the kernel tree I do see such a
comment is missing, but I do mention in all of the VRF tutorials such as
this one (see slide 79):

http://schd.ws/hosted_files/ossna2017/fe/vrf-tutorial-oss.pdf
Miaohe Lin May 5, 2019, 3:28 a.m. UTC | #4
On 2019/5/5 10:56, David Ahern wrote:
> On 5/4/19 8:11 PM, linmiaohe wrote:
>>
>>
>> On 2019/5/4 22:59, David Ahern wrote:
>>> On 5/4/19 7:13 AM, linmiaohe wrote:
>>>> From: Suanming.Mou <mousuanming@huawei.com>
>>>>
>>>> When config ip in default vrf same as the ip in specified
>>>> vrf, fib_lookup will return the route from table local
>>>> even if the in device is an enslaved l3mdev. Then the
>>>
>>> you need to move the local rule with a preference of 0 after the l3mdev
>>> rule.
>>>
>>>
>>
>> Move the local rule after l3mdev rule can get rid of this problem. And
>> even if this happend, we can delete the same ip address in default vrf
>> to fix it.
>> But I think maybe it's still a problem because other rule with default
>> vrf out device holds the specified vrf device. It looks unreasonable.
>>
>> Many Thanks.
>>
> 
> VRF is implemented using policy routing. If you do not move the local
> rule below the l3mdev rule, you are doing a lookup in the local table
> first, then vrf table, then main table. Doing the local table first can
> result in false hits - like the case of duplicate IP addresses in
> default VRF and a VRF. In short, it is just wrong.
> 
> Looking at the VRF documentation in the kernel tree I do see such a
> comment is missing, but I do mention in all of the VRF tutorials such as
> this one (see slide 79):
> 
> http://schd.ws/hosted_files/ossna2017/fe/vrf-tutorial-oss.pdf
> 
> .
> 

Thanks for your patience. As it's just like the case of duplicate
IP addresses in default VRF and a VRF and can be fix by hand, we
needn't this patch anymore.

Best regards.
diff mbox series

Patch

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 6fdf1c195d8e..74def8710ae8 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2077,6 +2077,11 @@  out:	return err;
 			}
 			do_cache = true;
 		}
+		/* Use fib res nh_dev as local input device because enslaved
+		 * l3mdev may hit route from other rule table. Dst_entry
+		 * should hold right device.
+		 */
+		dev = FIB_RES_DEV(*res);
 	}

 	rth = rt_dst_alloc(l3mdev_master_dev_rcu(dev) ? : net->loopback_dev,