diff mbox

ipv6: addrconf: fix dev refcont leak when DAD failed

Message ID 1473062791-21118-1-git-send-email-weiyongjun1@huawei.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Wei Yongjun Sept. 5, 2016, 8:06 a.m. UTC
In general, when DAD detected IPv6 duplicate address, ifp->state
will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
delayed work, the call tree should be like this:

ndisc_recv_ns
  -> addrconf_dad_failure        <- missing ifp put
     -> addrconf_mod_dad_work
       -> schedule addrconf_dad_work()
         -> addrconf_dad_stop()  <- missing ifp hold before call it

addrconf_dad_failure() called with ifp refcont holding but not put.
addrconf_dad_work() call addrconf_dad_stop() without extra holding
refcount. This will not cause any issue normally.

But the race between addrconf_dad_failure() and addrconf_dad_work()
may cause ifp refcount leak and netdevice can not be unregister,
dmesg show the following messages:

IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
...
unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Cc: stable@vger.kernel.org
Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing
to workqueue")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

Comments

Hannes Frederic Sowa Sept. 5, 2016, 10:02 a.m. UTC | #1
On 05.09.2016 10:06, Wei Yongjun wrote:
> In general, when DAD detected IPv6 duplicate address, ifp->state
> will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
> delayed work, the call tree should be like this:
> 
> ndisc_recv_ns
>   -> addrconf_dad_failure        <- missing ifp put
>      -> addrconf_mod_dad_work
>        -> schedule addrconf_dad_work()
>          -> addrconf_dad_stop()  <- missing ifp hold before call it
> 
> addrconf_dad_failure() called with ifp refcont holding but not put.
> addrconf_dad_work() call addrconf_dad_stop() without extra holding
> refcount. This will not cause any issue normally.
> 
> But the race between addrconf_dad_failure() and addrconf_dad_work()
> may cause ifp refcount leak and netdevice can not be unregister,
> dmesg show the following messages:
> 
> IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
> ...
> unregister_netdevice: waiting for eth0 to become free. Usage count = 1
> 
> Cc: stable@vger.kernel.org
> Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing
> to workqueue")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
> 
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index bdf368e..2f1f5d4 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -1948,6 +1948,7 @@ errdad:
>  	spin_unlock_bh(&ifp->lock);
>  
>  	addrconf_mod_dad_work(ifp, 0);
> +	in6_ifa_put(ifp);
>  }

This in6_ifa_put makes sense.

>  
>  /* Join to solicited addr multicast group.
> @@ -3857,6 +3858,7 @@ static void addrconf_dad_work(struct work_struct *w)
>  		addrconf_dad_begin(ifp);
>  		goto out;
>  	} else if (action == DAD_ABORT) {
> +		in6_ifa_hold(ifp);
>  		addrconf_dad_stop(ifp, 1);
>  		if (disable_ipv6)
>  			addrconf_ifdown(idev->dev, 0);
> 

But why you add a in6_ifa_hold here isn't clear to me. Could you explain
why this is necessary? I don't see any async stuff being done in
addrconf_dad_stop, thus the reference we already have should be
sufficient for the lifetime of addrconf_dad_stop.

Thanks,
Hannes
Wei Yongjun Sept. 5, 2016, 11:05 a.m. UTC | #2
On 05.09.2016 10:06, Wei Yongjun wrote:
>> In general, when DAD detected IPv6 duplicate address, ifp->state will 

>> be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a delayed 

>> work, the call tree should be like this:

>> 

>> ndisc_recv_ns

>>   -> addrconf_dad_failure        <- missing ifp put

>>      -> addrconf_mod_dad_work

>>        -> schedule addrconf_dad_work()

>>          -> addrconf_dad_stop()  <- missing ifp hold before call it

>> 

>> addrconf_dad_failure() called with ifp refcont holding but not put.

>> addrconf_dad_work() call addrconf_dad_stop() without extra holding 

>> refcount. This will not cause any issue normally.

>> 

>> But the race between addrconf_dad_failure() and addrconf_dad_work() 

>> may cause ifp refcount leak and netdevice can not be unregister, dmesg 

>> show the following messages:

>> 

>> IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!

>> ...

>> unregister_netdevice: waiting for eth0 to become free. Usage count = 1

>> 

>> Cc: stable@vger.kernel.org

>> Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing to 

>> workqueue")

>> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

>> 

>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 

>> bdf368e..2f1f5d4 100644

>> --- a/net/ipv6/addrconf.c

>> +++ b/net/ipv6/addrconf.c

>> @@ -1948,6 +1948,7 @@ errdad:

>>  	spin_unlock_bh(&ifp->lock);

>>  

>>  	addrconf_mod_dad_work(ifp, 0);

>> +	in6_ifa_put(ifp);

>>  }


>This in6_ifa_put makes sense.


>>  

>>  /* Join to solicited addr multicast group.

>> @@ -3857,6 +3858,7 @@ static void addrconf_dad_work(struct work_struct *w)

>>  		addrconf_dad_begin(ifp);

>>  		goto out;

>>  	} else if (action == DAD_ABORT) {

>> +		in6_ifa_hold(ifp);

>>  		addrconf_dad_stop(ifp, 1);

>>  		if (disable_ipv6)

>>  			addrconf_ifdown(idev->dev, 0);

>> 


>But why you add a in6_ifa_hold here isn't clear to me. Could you explain why this is

>necessary? I don't see any async stuff being done in addrconf_dad_stop, thus the

>reference we already have should be sufficient for the lifetime of addrconf_dad_stop.


I think it that link local is added with flag IFA_F_PERMANENT, which we real need
it is to remove in6_ifa_put() in addrconf_dad_stop.

static void addrconf_dad_stop(...)
{
	if (ifp->flags&IFA_F_PERMANENT) {
		...
		in6_ifa_put(ifp);   <== remove this line since caller hold refcount
	} else if (ifp->flags&IFA_F_TEMPORARY) {
		...
		ipv6_del_addr(ifp);
	} else {
		ipv6_del_addr(ifp);
	}
}

If so, the addrconf_dad_begin() also need to fix because if hold a ref before
addrconf_dad_stop():

static void addrconf_dad_begin(struct inet6_ifaddr *ifp)
{
 ...
  	in6_ifa_hold(ifp);  <-- remove this line 
	addrconf_dad_stop(ifp, 0);
 ...
}

Also inet6_addr_del which called ipv6_del_addr with refcount hold:
inet6_addr_del(...)
{
  ...
  list_for_each_entry(...) {
    ...
    in6_ifa_hold(ifp);   <-- remove this line
	...
	ipv6_del_addr(ifp);
	...
  }
  ...
}

Regards,
Yongjun Wei
Wei Yongjun Sept. 5, 2016, 11:54 a.m. UTC | #3
On 05.09.2016 10:06, Wei Yongjun wrote:
>>> In general, when DAD detected IPv6 duplicate address, ifp->state will 

>>> be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a delayed 

>>> work, the call tree should be like this:

>>> 

>>> ndisc_recv_ns

>>>   -> addrconf_dad_failure        <- missing ifp put

>>>      -> addrconf_mod_dad_work

>>>        -> schedule addrconf_dad_work()

>>>          -> addrconf_dad_stop()  <- missing ifp hold before call it

>>> 

>>> addrconf_dad_failure() called with ifp refcont holding but not put.

>>> addrconf_dad_work() call addrconf_dad_stop() without extra holding 

>>> refcount. This will not cause any issue normally.

>>> 

>>> But the race between addrconf_dad_failure() and addrconf_dad_work() 

>>> may cause ifp refcount leak and netdevice can not be unregister, 

>>> dmesg show the following messages:

>>> 

>>> IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!

>>> ...

>>> unregister_netdevice: waiting for eth0 to become free. Usage count = 

>>> 1

>>> 

>>> Cc: stable@vger.kernel.org

>>> Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing 

>>> to

>>> workqueue")

>>> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

>>> 

>>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index

>>> bdf368e..2f1f5d4 100644

>>> --- a/net/ipv6/addrconf.c

>>> +++ b/net/ipv6/addrconf.c

>>> @@ -1948,6 +1948,7 @@ errdad:

>>>  	spin_unlock_bh(&ifp->lock);

>>>  

>>>  	addrconf_mod_dad_work(ifp, 0);

>>> +	in6_ifa_put(ifp);

>>>  }


>>This in6_ifa_put makes sense.


>>>  

>>>  /* Join to solicited addr multicast group.

>>> @@ -3857,6 +3858,7 @@ static void addrconf_dad_work(struct work_struct *w)

>>>  		addrconf_dad_begin(ifp);

>>>  		goto out;

>>>  	} else if (action == DAD_ABORT) {

>>> +		in6_ifa_hold(ifp);

>>>  		addrconf_dad_stop(ifp, 1);

>>>  		if (disable_ipv6)

>>>  			addrconf_ifdown(idev->dev, 0);

>>> 


>>But why you add a in6_ifa_hold here isn't clear to me. Could you 

>>explain why this is necessary? I don't see any async stuff being done 

>>in addrconf_dad_stop, thus the reference we already have should be sufficient for the lifetime of addrconf_dad_stop.


> I think it that link local is added with flag IFA_F_PERMANENT, which we real need it is to remove in6_ifa_put() in addrconf_dad_stop.


> static void addrconf_dad_stop(...)

> {

>	if (ifp->flags&IFA_F_PERMANENT) {

>		...

>		in6_ifa_put(ifp);   <== remove this line since caller hold refcount

>	} else if (ifp->flags&IFA_F_TEMPORARY) {

>		...

>		ipv6_del_addr(ifp);

>	} else {

>		ipv6_del_addr(ifp);

>	}

>}


I see the comment of ipv6_del_addr:

/* This function wants to get referenced ifp and releases it before return */
static void ipv6_del_addr(struct inet6_ifaddr *ifp)
{
   ...
}

Both in6_ifa_put() and ipv6_del_addr() need a refcount holding, so we should
use a extra in6_ifa_hold() call before call addrconf_dad_stop() as the origin patch.

Regards,
Wei Yongjun
Hannes Frederic Sowa Sept. 5, 2016, 12:03 p.m. UTC | #4
On 05.09.2016 13:54, weiyongjun (A) wrote:
> On 05.09.2016 10:06, Wei Yongjun wrote:
>>>> In general, when DAD detected IPv6 duplicate address, ifp->state will 
>>>> be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a delayed 
>>>> work, the call tree should be like this:
>>>>
>>>> ndisc_recv_ns
>>>>   -> addrconf_dad_failure        <- missing ifp put
>>>>      -> addrconf_mod_dad_work
>>>>        -> schedule addrconf_dad_work()
>>>>          -> addrconf_dad_stop()  <- missing ifp hold before call it
>>>>
>>>> addrconf_dad_failure() called with ifp refcont holding but not put.
>>>> addrconf_dad_work() call addrconf_dad_stop() without extra holding 
>>>> refcount. This will not cause any issue normally.
>>>>
>>>> But the race between addrconf_dad_failure() and addrconf_dad_work() 
>>>> may cause ifp refcount leak and netdevice can not be unregister, 
>>>> dmesg show the following messages:
>>>>
>>>> IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
>>>> ...
>>>> unregister_netdevice: waiting for eth0 to become free. Usage count = 
>>>> 1
>>>>
>>>> Cc: stable@vger.kernel.org
>>>> Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing 
>>>> to
>>>> workqueue")
>>>> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
>>>>
>>>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index
>>>> bdf368e..2f1f5d4 100644
>>>> --- a/net/ipv6/addrconf.c
>>>> +++ b/net/ipv6/addrconf.c
>>>> @@ -1948,6 +1948,7 @@ errdad:
>>>>  	spin_unlock_bh(&ifp->lock);
>>>>  
>>>>  	addrconf_mod_dad_work(ifp, 0);
>>>> +	in6_ifa_put(ifp);
>>>>  }
> 
>>> This in6_ifa_put makes sense.
> 
>>>>  
>>>>  /* Join to solicited addr multicast group.
>>>> @@ -3857,6 +3858,7 @@ static void addrconf_dad_work(struct work_struct *w)
>>>>  		addrconf_dad_begin(ifp);
>>>>  		goto out;
>>>>  	} else if (action == DAD_ABORT) {
>>>> +		in6_ifa_hold(ifp);
>>>>  		addrconf_dad_stop(ifp, 1);
>>>>  		if (disable_ipv6)
>>>>  			addrconf_ifdown(idev->dev, 0);
>>>>
> 
>>> But why you add a in6_ifa_hold here isn't clear to me. Could you 
>>> explain why this is necessary? I don't see any async stuff being done 
>>> in addrconf_dad_stop, thus the reference we already have should be sufficient for the lifetime of addrconf_dad_stop.
> 
>> I think it that link local is added with flag IFA_F_PERMANENT, which we real need it is to remove in6_ifa_put() in addrconf_dad_stop.
> 
>> static void addrconf_dad_stop(...)
>> {
>> 	if (ifp->flags&IFA_F_PERMANENT) {
>> 		...
>> 		in6_ifa_put(ifp);   <== remove this line since caller hold refcount
>> 	} else if (ifp->flags&IFA_F_TEMPORARY) {
>> 		...
>> 		ipv6_del_addr(ifp);
>> 	} else {
>> 		ipv6_del_addr(ifp);
>> 	}
>> }
> 
> I see the comment of ipv6_del_addr:
> 
> /* This function wants to get referenced ifp and releases it before return */
> static void ipv6_del_addr(struct inet6_ifaddr *ifp)
> {
>    ...
> }
> 
> Both in6_ifa_put() and ipv6_del_addr() need a refcount holding, so we should
> use a extra in6_ifa_hold() call before call addrconf_dad_stop() as the origin patch.

You are correct!

We always call in6_ifa_put() at the end of addrconf_dad_work. In this
case it is not correct. Either we add a in6_ifa_hold or we suppress or
jump over the last in6_ifa_put.

Good catch!

Hannes
David Miller Sept. 6, 2016, 9:19 p.m. UTC | #5
From: Wei Yongjun <weiyongjun1@huawei.com>
Date: Mon, 5 Sep 2016 16:06:31 +0800

> In general, when DAD detected IPv6 duplicate address, ifp->state
> will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
> delayed work, the call tree should be like this:
> 
> ndisc_recv_ns
>   -> addrconf_dad_failure        <- missing ifp put
>      -> addrconf_mod_dad_work
>        -> schedule addrconf_dad_work()
>          -> addrconf_dad_stop()  <- missing ifp hold before call it
> 
> addrconf_dad_failure() called with ifp refcont holding but not put.
> addrconf_dad_work() call addrconf_dad_stop() without extra holding
> refcount. This will not cause any issue normally.
> 
> But the race between addrconf_dad_failure() and addrconf_dad_work()
> may cause ifp refcount leak and netdevice can not be unregister,
> dmesg show the following messages:
> 
> IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
> ...
> unregister_netdevice: waiting for eth0 to become free. Usage count = 1
> 
> Cc: stable@vger.kernel.org
> Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing
> to workqueue")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>

Applied and queued up for -stable.
diff mbox

Patch

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index bdf368e..2f1f5d4 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1948,6 +1948,7 @@  errdad:
 	spin_unlock_bh(&ifp->lock);
 
 	addrconf_mod_dad_work(ifp, 0);
+	in6_ifa_put(ifp);
 }
 
 /* Join to solicited addr multicast group.
@@ -3857,6 +3858,7 @@  static void addrconf_dad_work(struct work_struct *w)
 		addrconf_dad_begin(ifp);
 		goto out;
 	} else if (action == DAD_ABORT) {
+		in6_ifa_hold(ifp);
 		addrconf_dad_stop(ifp, 1);
 		if (disable_ipv6)
 			addrconf_ifdown(idev->dev, 0);