macvlan: Fix device ref leak when purging bc_queue

Submitted by Joe.Ghalam@dell.com on April 20, 2017, 4:09 p.m.

Details

Message ID 1492704599476.84165@Dell.com
State Not Applicable
Delegated to: David Miller
Headers show

Commit Message

Joe.Ghalam@dell.com April 20, 2017, 4:09 p.m.
Hi Herbert,
I agree with this change, but the same purge would be needed for the macvlan_dellink() call also. 
Thanks,
Joe

Comments

Herbert Xu April 21, 2017, 4:40 a.m.
On Thu, Apr 20, 2017 at 04:09:56PM +0000, Joe.Ghalam@dell.com wrote:
> I agree with this change, but the same purge would be needed for the macvlan_dellink() call also. 

I don't think that's necessary because as long as the master
device is still around it will continue to process the broadcast
queue, thus removing any reference counts held.

It's only when the queue is purged that we run into trouble.

Cheers,
Joe.Ghalam@dell.com April 21, 2017, 2:40 p.m.
That's not true. macvlan_dellink() unregisters the queue, and macvlan_process_broadcast() will never get called. Please note that I'm not speculating. I have traced enabled on the dev_put and dev_hold, and I'm reporting a real, reproducible issue.
Her is a sequence of calls logged, when the issue happens. macvlan_process_broadcast() never happens.

Apr 19 04:35:39 OS10 kernel: e101-001-0.v257: dev_put 16 dst_destroy
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_hold 15 dev_get_by_index
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 16 do_ip_setsockopt
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_hold 15 dst_alloc
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 16 dst_destroy
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_hold 15 macvlan_broadcast_enqueue
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 16 macvlan_process_broadcast
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_hold 15 macvlan_broadcast_enqueue
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 16 macvlan_process_broadcast
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_hold 15 macvlan_broadcast_enqueue    <---insert
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 16 neigh_destroy
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 15 neigh_destroy
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 14 neigh_destroy
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 13 dst_destroy
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 12 dst_destroy
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 11 __netdev_adjacent_dev_remove <--- macvlan_dellink()
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 10 __netdev_adjacent_dev_remove <--- macvlan_dellink()
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 9 neigh_parms_release
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 8 neigh_parms_release
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 7 in6_dev_finish_destroy
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 6 rx_queue_release
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 5 netdev_queue_release
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 4 rollback_registered_many
Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 3 free_fib_info_rcu
David Miller April 21, 2017, 3:02 p.m.
Please DO NOT top-post.

Thank you.
Mahesh Bandewar April 21, 2017, 7:23 p.m.
On Fri, Apr 21, 2017 at 7:40 AM,  <Joe.Ghalam@dell.com> wrote:
> That's not true. macvlan_dellink() unregisters the queue, and macvlan_process_broadcast() will never get called. Please note that I'm not speculating. I have traced enabled on the dev_put and dev_hold, and I'm reporting a real, reproducible issue.
> Her is a sequence of calls logged, when the issue happens. macvlan_process_broadcast() never happens.
>
> Apr 19 04:35:39 OS10 kernel: e101-001-0.v257: dev_put 16 dst_destroy
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_hold 15 dev_get_by_index
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 16 do_ip_setsockopt
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_hold 15 dst_alloc
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 16 dst_destroy
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_hold 15 macvlan_broadcast_enqueue
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 16 macvlan_process_broadcast
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_hold 15 macvlan_broadcast_enqueue
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 16 macvlan_process_broadcast
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_hold 15 macvlan_broadcast_enqueue    <---insert
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 16 neigh_destroy
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 15 neigh_destroy
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 14 neigh_destroy
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 13 dst_destroy
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 12 dst_destroy
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 11 __netdev_adjacent_dev_remove <--- macvlan_dellink()
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 10 __netdev_adjacent_dev_remove <--- macvlan_dellink()
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 9 neigh_parms_release
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 8 neigh_parms_release
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 7 in6_dev_finish_destroy
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 6 rx_queue_release
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 5 netdev_queue_release
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 4 rollback_registered_many
> Apr 19 04:35:41 OS10 kernel: e101-001-0.v257: dev_put 3 free_fib_info_rcu
>
May be the system is busy and snapshot is too small, and eventually
process_broadcast() should get called. Deleting a slave does nothing
about cancelling the work-queue so it would happen eventually.

The change that Herbert proposed is correct. When packets are enqueued
for processing later a dev reference is taken and it's removed when
it's processed when it gets scheduled. The backlog is per port so it
makes sense to remove reference(s) before purging the queue prior to
deleting the port.

> ________________________________________
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Sent: Thursday, April 20, 2017 9:40 PM
> To: Ghalam, Joe
> Cc: davem@davemloft.net; Wichmann, Clifford; netdev@vger.kernel.org
> Subject: Re: macvlan: Fix device ref leak when purging bc_queue
>
> On Thu, Apr 20, 2017 at 04:09:56PM +0000, Joe.Ghalam@dell.com wrote:
>> I agree with this change, but the same purge would be needed for the macvlan_dellink() call also.
>
> I don't think that's necessary because as long as the master
> device is still around it will continue to process the broadcast
> queue, thus removing any reference counts held.
>
> It's only when the queue is purged that we run into trouble.
>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Joe.Ghalam@dell.com April 21, 2017, 8:37 p.m.

Herbert Xu April 24, 2017, 7:56 a.m.
On Fri, Apr 21, 2017 at 02:40:50PM +0000, Joe.Ghalam@dell.com wrote:
> That's not true. macvlan_dellink() unregisters the queue, and macvlan_process_broadcast() will never get called. Please note that I'm not speculating. I have traced enabled on the dev_put and dev_hold, and I'm reporting a real, reproducible issue.

The only thing that can stop macvlan_process_broadcast from getting
called is macvlan_port_destroy.  Nothing else can stop the work
queue, unless of course the work queue mechanism itself is broken.

So if you're sure macvlan_port_destroy is never even called in
your case, then you'll need to start debugging the kernel work
queue mechanism to see why macvlan_process_broadcast is not getting
called.

Cheers,
Joe.Ghalam@dell.com April 24, 2017, 3:01 p.m.
> The only thing that can stop macvlan_process_broadcast from getting
> called is macvlan_port_destroy.  Nothing else can stop the work
> queue, unless of course the work queue mechanism itself is broken.

> So if you're sure macvlan_port_destroy is never even called in
> your case, then you'll need to start debugging the kernel work
> queue mechanism to see why macvlan_process_broadcast is not getting
> called.

I will get your changes reloaded and re-tested without any other debug tools. Hopefully, we'll see success. I will let you know if I see any issues. 
Btw, is your fix committed already? if not, do you know when and where it would be committed?

Thanks,
Joe
David Miller April 24, 2017, 3:10 p.m.
From: <Joe.Ghalam@dell.com>
Date: Mon, 24 Apr 2017 15:01:24 +0000

>> The only thing that can stop macvlan_process_broadcast from getting
>> called is macvlan_port_destroy.  Nothing else can stop the work
>> queue, unless of course the work queue mechanism itself is broken.
> 
>> So if you're sure macvlan_port_destroy is never even called in
>> your case, then you'll need to start debugging the kernel work
>> queue mechanism to see why macvlan_process_broadcast is not getting
>> called.
> 
> I will get your changes reloaded and re-tested without any other debug tools. Hopefully, we'll see success. I will let you know if I see any issues. 
> Btw, is your fix committed already? if not, do you know when and where it would be committed?

I'm waiting for this discussion to settle down before I apply the
patch.
Joe.Ghalam@dell.com April 24, 2017, 3:30 p.m.
> I'm waiting for this discussion to settle down before I apply the patch.

Thanks David. I will get some answers soon, and hopefully the change is a good one.

Patch hide | download patch | download mbox

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 9261722..b34eaaa 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -1139,6 +1139,7 @@  static int macvlan_port_create(struct net_device *dev)
 static void macvlan_port_destroy(struct net_device *dev)
 {
        struct macvlan_port *port = macvlan_port_get_rtnl(dev);
+       struct sk_buff *skb;

        dev->priv_flags &= ~IFF_MACVLAN_PORT;
        netdev_rx_handler_unregister(dev);
@@ -1147,7 +1148,15 @@  static void macvlan_port_destroy(struct net_device *dev)
         * but we need to cancel it and purge left skbs if any.
         */
        cancel_work_sync(&port->bc_work);
-       __skb_queue_purge(&port->bc_queue);
+
+       while ((skb = __skb_dequeue(&port->bc_queue))) {
+               const struct macvlan_dev *src = MACVLAN_SKB_CB(skb)->src;
+
+               if (src)
+                       dev_put(src->dev);
+
+               kfree_skb(skb);
+       }

        kfree(port);
 }