diff mbox series

[net] net: fix memory leak in register_netdevice() on error path

Message ID 20201126132312.3593725-1-yangyingliang@huawei.com
State Superseded
Headers show
Series [net] net: fix memory leak in register_netdevice() on error path | expand

Commit Message

Yang Yingliang Nov. 26, 2020, 1:23 p.m. UTC
I got a memleak report when doing fault-inject test:

unreferenced object 0xffff88810ace9000 (size 1024):
  comm "ip", pid 4622, jiffies 4295457037 (age 43.378s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000008abe41>] __kmalloc+0x10f/0x210
    [<000000005d3533a6>] veth_dev_init+0x140/0x310
    [<0000000088353c64>] register_netdevice+0x496/0x7a0
    [<000000001324d322>] veth_newlink+0x40b/0x960
    [<00000000d0799866>] __rtnl_newlink+0xd8c/0x1360
    [<00000000d616040a>] rtnl_newlink+0x6b/0xa0
    [<00000000e0a1600d>] rtnetlink_rcv_msg+0x3cc/0x9e0
    [<000000009eeff98b>] netlink_rcv_skb+0x130/0x3a0
    [<00000000500f8be1>] netlink_unicast+0x4da/0x700
    [<00000000666c03b3>] netlink_sendmsg+0x7fe/0xcb0
    [<0000000073b28103>] sock_sendmsg+0x143/0x180
    [<00000000ad746a30>] ____sys_sendmsg+0x677/0x810
    [<0000000087dd98e5>] ___sys_sendmsg+0x105/0x180
    [<00000000028dd365>] __sys_sendmsg+0xf0/0x1c0
    [<00000000a6bfbae6>] do_syscall_64+0x33/0x40
    [<00000000e00521b4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

It seems ifb and loopback may also hit the leak, so I try to fix this in
register_netdevice().

In common case, priv_destructor() will be called in netdev_run_todo()
after calling ndo_uninit() in rollback_registered(), on other error
path in register_netdevice(), ndo_uninit() and priv_destructor() are
called before register_netdevice() return, but in this case,
priv_destructor() will never be called, then it causes memory leak,
so we should call priv_destructor() here.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 net/core/dev.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Toshiaki Makita Nov. 29, 2020, 1:56 p.m. UTC | #1
On 2020/11/26 22:23, Yang Yingliang wrote:
...
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> ---
>   net/core/dev.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 82dc6b48e45f..907204395b64 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev)
>   	ret = notifier_to_errno(ret);
>   	if (ret) {
>   		rollback_registered(dev);
> +		/*
> +		 * In common case, priv_destructor() will be

As per netdev-faq, the comment style should be

/* foobar blah blah blah
  * another line of text
  */

rather than

/*
  * foobar blah blah blah
  * another line of text
  */

> +		 * called in netdev_run_todo() after calling
> +		 * ndo_uninit() in rollback_registered().
> +		 * But in this case, priv_destructor() will
> +		 * never be called, then it causes memory
> +		 * leak, so we should call priv_destructor()
> +		 * here.
> +		 */
> +		if (dev->priv_destructor)
> +			dev->priv_destructor(dev);

To be in line with netdev_run_todo(), I think priv_destructor() should be
called after "dev->reg_state = NETREG_UNREGISTERED".

Toshiaki Makita

>   		rcu_barrier();
>   
>   		dev->reg_state = NETREG_UNREGISTERED;
>
Stephen Hemminger Nov. 30, 2020, 4:39 a.m. UTC | #2
On Thu, 26 Nov 2020 21:23:12 +0800
Yang Yingliang <yangyingliang@huawei.com> wrote:

> I got a memleak report when doing fault-inject test:
> 
> unreferenced object 0xffff88810ace9000 (size 1024):
>   comm "ip", pid 4622, jiffies 4295457037 (age 43.378s)
>   hex dump (first 32 bytes):
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<00000000008abe41>] __kmalloc+0x10f/0x210
>     [<000000005d3533a6>] veth_dev_init+0x140/0x310
>     [<0000000088353c64>] register_netdevice+0x496/0x7a0
>     [<000000001324d322>] veth_newlink+0x40b/0x960
>     [<00000000d0799866>] __rtnl_newlink+0xd8c/0x1360
>     [<00000000d616040a>] rtnl_newlink+0x6b/0xa0
>     [<00000000e0a1600d>] rtnetlink_rcv_msg+0x3cc/0x9e0
>     [<000000009eeff98b>] netlink_rcv_skb+0x130/0x3a0
>     [<00000000500f8be1>] netlink_unicast+0x4da/0x700
>     [<00000000666c03b3>] netlink_sendmsg+0x7fe/0xcb0
>     [<0000000073b28103>] sock_sendmsg+0x143/0x180
>     [<00000000ad746a30>] ____sys_sendmsg+0x677/0x810
>     [<0000000087dd98e5>] ___sys_sendmsg+0x105/0x180
>     [<00000000028dd365>] __sys_sendmsg+0xf0/0x1c0
>     [<00000000a6bfbae6>] do_syscall_64+0x33/0x40
>     [<00000000e00521b4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> It seems ifb and loopback may also hit the leak, so I try to fix this in
> register_netdevice().
> 
> In common case, priv_destructor() will be called in netdev_run_todo()
> after calling ndo_uninit() in rollback_registered(), on other error
> path in register_netdevice(), ndo_uninit() and priv_destructor() are
> called before register_netdevice() return, but in this case,
> priv_destructor() will never be called, then it causes memory leak,
> so we should call priv_destructor() here.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> ---
>  net/core/dev.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 82dc6b48e45f..907204395b64 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev)
>  	ret = notifier_to_errno(ret);
>  	if (ret) {
>  		rollback_registered(dev);
> +		/*
> +		 * In common case, priv_destructor() will be
> +		 * called in netdev_run_todo() after calling
> +		 * ndo_uninit() in rollback_registered().
> +		 * But in this case, priv_destructor() will
> +		 * never be called, then it causes memory
> +		 * leak, so we should call priv_destructor()
> +		 * here.
> +		 */
> +		if (dev->priv_destructor)
> +			dev->priv_destructor(dev);

Are you sure this is safe?
Several devices have destructors that call free_netdev.
Up until now a common pattern for those devices was to call
free_netdev on error. After this change it would lead to double free.
Yang Yingliang Nov. 30, 2020, 11:12 a.m. UTC | #3
On 2020/11/30 12:39, Stephen Hemminger wrote:
> On Thu, 26 Nov 2020 21:23:12 +0800
> Yang Yingliang <yangyingliang@huawei.com> wrote:
>
>> I got a memleak report when doing fault-inject test:
>>
>> unreferenced object 0xffff88810ace9000 (size 1024):
>>    comm "ip", pid 4622, jiffies 4295457037 (age 43.378s)
>>    hex dump (first 32 bytes):
>>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>    backtrace:
>>      [<00000000008abe41>] __kmalloc+0x10f/0x210
>>      [<000000005d3533a6>] veth_dev_init+0x140/0x310
>>      [<0000000088353c64>] register_netdevice+0x496/0x7a0
>>      [<000000001324d322>] veth_newlink+0x40b/0x960
>>      [<00000000d0799866>] __rtnl_newlink+0xd8c/0x1360
>>      [<00000000d616040a>] rtnl_newlink+0x6b/0xa0
>>      [<00000000e0a1600d>] rtnetlink_rcv_msg+0x3cc/0x9e0
>>      [<000000009eeff98b>] netlink_rcv_skb+0x130/0x3a0
>>      [<00000000500f8be1>] netlink_unicast+0x4da/0x700
>>      [<00000000666c03b3>] netlink_sendmsg+0x7fe/0xcb0
>>      [<0000000073b28103>] sock_sendmsg+0x143/0x180
>>      [<00000000ad746a30>] ____sys_sendmsg+0x677/0x810
>>      [<0000000087dd98e5>] ___sys_sendmsg+0x105/0x180
>>      [<00000000028dd365>] __sys_sendmsg+0xf0/0x1c0
>>      [<00000000a6bfbae6>] do_syscall_64+0x33/0x40
>>      [<00000000e00521b4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> It seems ifb and loopback may also hit the leak, so I try to fix this in
>> register_netdevice().
>>
>> In common case, priv_destructor() will be called in netdev_run_todo()
>> after calling ndo_uninit() in rollback_registered(), on other error
>> path in register_netdevice(), ndo_uninit() and priv_destructor() are
>> called before register_netdevice() return, but in this case,
>> priv_destructor() will never be called, then it causes memory leak,
>> so we should call priv_destructor() here.
>>
>> Reported-by: Hulk Robot <hulkci@huawei.com>
>> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
>> ---
>>   net/core/dev.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 82dc6b48e45f..907204395b64 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev)
>>   	ret = notifier_to_errno(ret);
>>   	if (ret) {
>>   		rollback_registered(dev);
>> +		/*
>> +		 * In common case, priv_destructor() will be
>> +		 * called in netdev_run_todo() after calling
>> +		 * ndo_uninit() in rollback_registered().
>> +		 * But in this case, priv_destructor() will
>> +		 * never be called, then it causes memory
>> +		 * leak, so we should call priv_destructor()
>> +		 * here.
>> +		 */
>> +		if (dev->priv_destructor)
>> +			dev->priv_destructor(dev);
> Are you sure this is safe?
> Several devices have destructors that call free_netdev.
> Up until now a common pattern for those devices was to call
> free_netdev on error. After this change it would lead to double free.

After commit cf124db566e6 ("net: Fix inconsistent teardown and release 
of private netdev state."),

free_netdev() is not be called in priv_destructor().

>
> .
Yang Yingliang Nov. 30, 2020, 11:13 a.m. UTC | #4
On 2020/11/29 21:56, Toshiaki Makita wrote:
> On 2020/11/26 22:23, Yang Yingliang wrote:
> ...
>> Reported-by: Hulk Robot <hulkci@huawei.com>
>> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
>> ---
>>   net/core/dev.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 82dc6b48e45f..907204395b64 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev)
>>       ret = notifier_to_errno(ret);
>>       if (ret) {
>>           rollback_registered(dev);
>> +        /*
>> +         * In common case, priv_destructor() will be
>
> As per netdev-faq, the comment style should be
>
> /* foobar blah blah blah
>  * another line of text
>  */
>
> rather than
>
> /*
>  * foobar blah blah blah
>  * another line of text
>  */
>
>> +         * called in netdev_run_todo() after calling
>> +         * ndo_uninit() in rollback_registered().
>> +         * But in this case, priv_destructor() will
>> +         * never be called, then it causes memory
>> +         * leak, so we should call priv_destructor()
>> +         * here.
>> +         */
>> +        if (dev->priv_destructor)
>> +            dev->priv_destructor(dev);
>
> To be in line with netdev_run_todo(), I think priv_destructor() should be
> called after "dev->reg_state = NETREG_UNREGISTERED".
OK,  I will send a v2 later.
>
> Toshiaki Makita
>
>>           rcu_barrier();
>>             dev->reg_state = NETREG_UNREGISTERED;
>>
diff mbox series

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 82dc6b48e45f..907204395b64 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10000,6 +10000,17 @@  int register_netdevice(struct net_device *dev)
 	ret = notifier_to_errno(ret);
 	if (ret) {
 		rollback_registered(dev);
+		/*
+		 * In common case, priv_destructor() will be
+		 * called in netdev_run_todo() after calling
+		 * ndo_uninit() in rollback_registered().
+		 * But in this case, priv_destructor() will
+		 * never be called, then it causes memory
+		 * leak, so we should call priv_destructor()
+		 * here.
+		 */
+		if (dev->priv_destructor)
+			dev->priv_destructor(dev);
 		rcu_barrier();
 
 		dev->reg_state = NETREG_UNREGISTERED;