Message ID | f98ccbfee35a7ab7f9499ede9b34e6b1120d88f1.1404172155.git.decot@googlers.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
From: David Decotigny <decot@googlers.com> Date: Mon, 30 Jun 2014 16:50:10 -0700 > This ensures that the ndo_netpoll_cleanup callback is called for every > device that provides one. Otherwise there is a risk of reference leak > with bonding for example, which depends on this callback to cleanup > the slaves' references to netpoll info. > > Tested: > see patch "netpoll: fix use after free" > > Signed-off-by: David Decotigny <decot@googlers.com> I definitely don't understand this. Why would we call the cleanup function of an object before it's reference count hits zero? It is exactly the act of reaching a zero refcount which should trigger invoking the cleanup callback. If a refcount is being released in another location without checking if it hits zero and invoking the cleanup if so, _THAT_ is the bug. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, Thanks for the feedback. This patch results from manual inspection of the code. I agree my commit description is abusive: in the case of bonding, I think everything is fine, there should be no ref leak, cleanup paths seem clean. My point was to make things more predictable: ndo_netpoll_cleanup called anyways to acknowledge actual loss of a ref to npinfo, irrespective of whether it's the last ref or not. Without this patch, calling ndo_netpoll_cleanup would depend on some timing behavior, hard to predict, and users of the API have better be careful to reclaim the refs manually anyways: as a consequence, not sure this callback is actually required in its current inception. On Mon, Jul 7, 2014 at 7:35 PM, David Miller <davem@davemloft.net> wrote: > From: David Decotigny <decot@googlers.com> > Date: Mon, 30 Jun 2014 16:50:10 -0700 > >> This ensures that the ndo_netpoll_cleanup callback is called for every >> device that provides one. Otherwise there is a risk of reference leak >> with bonding for example, which depends on this callback to cleanup >> the slaves' references to netpoll info. >> >> Tested: >> see patch "netpoll: fix use after free" >> >> Signed-off-by: David Decotigny <decot@googlers.com> > > I definitely don't understand this. > > Why would we call the cleanup function of an object before it's > reference count hits zero? It is exactly the act of reaching a > zero refcount which should trigger invoking the cleanup callback. > > If a refcount is being released in another location without checking > if it hits zero and invoking the cleanup if so, _THAT_ is the bug. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: David Decotigny <decot@googlers.com> Date: Tue, 8 Jul 2014 12:35:14 -0700 > Thanks for the feedback. This patch results from manual inspection of > the code. I agree my commit description is abusive: in the case of > bonding, I think everything is fine, there should be no ref leak, > cleanup paths seem clean. > > My point was to make things more predictable: ndo_netpoll_cleanup > called anyways to acknowledge actual loss of a ref to npinfo, > irrespective of whether it's the last ref or not. Without this patch, > calling ndo_netpoll_cleanup would depend on some timing behavior, hard > to predict, and users of the API have better be careful to reclaim the > refs manually anyways: as a consequence, not sure this callback is > actually required in its current inception. You've increased my confusion rather than decreased it. You fail to address the core issue in my feedback: Whoever drops the refcount to zero must be the one to invoke the cleanup function. Please address this concisely, and directly. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
In that case, that's what the original code does: dropping this patch 2/2. Patch 1/2 "netpoll: fix use after free" is still needed to prevent panics, though. On Tue, Jul 8, 2014 at 2:17 PM, David Miller <davem@davemloft.net> wrote: > From: David Decotigny <decot@googlers.com> > Date: Tue, 8 Jul 2014 12:35:14 -0700 > >> Thanks for the feedback. This patch results from manual inspection of >> the code. I agree my commit description is abusive: in the case of >> bonding, I think everything is fine, there should be no ref leak, >> cleanup paths seem clean. >> >> My point was to make things more predictable: ndo_netpoll_cleanup >> called anyways to acknowledge actual loss of a ref to npinfo, >> irrespective of whether it's the last ref or not. Without this patch, >> calling ndo_netpoll_cleanup would depend on some timing behavior, hard >> to predict, and users of the API have better be careful to reclaim the >> refs manually anyways: as a consequence, not sure this callback is >> actually required in its current inception. > > You've increased my confusion rather than decreased it. > > You fail to address the core issue in my feedback: > > Whoever drops the refcount to zero must be the one to invoke > the cleanup function. > > Please address this concisely, and directly. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: David Decotigny <decot@googlers.com> Date: Tue, 8 Jul 2014 14:50:24 -0700 > In that case, that's what the original code does: dropping this patch 2/2. > > Patch 1/2 "netpoll: fix use after free" is still needed to prevent > panics, though. Please resubmit it then. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 907fb5e..1e10d5d 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -802,6 +802,7 @@ static void rcu_cleanup_netpoll_info(struct rcu_head *rcu_head) void __netpoll_cleanup(struct netpoll *np) { struct netpoll_info *npinfo; + const struct net_device_ops *ops; /* rtnl_dereference would be preferable here but * rcu_cleanup_netpoll path can put us in here safely without @@ -813,17 +814,17 @@ void __netpoll_cleanup(struct netpoll *np) synchronize_srcu(&netpoll_srcu); - if (atomic_dec_and_test(&npinfo->refcnt)) { - const struct net_device_ops *ops; + ops = np->dev->netdev_ops; + if (ops->ndo_netpoll_cleanup) + ops->ndo_netpoll_cleanup(np->dev); - ops = np->dev->netdev_ops; - if (ops->ndo_netpoll_cleanup) - ops->ndo_netpoll_cleanup(np->dev); + /* before dropping ref count, make sure this device does not + * reference npinfo anymore + */ + RCU_INIT_POINTER(np->dev->npinfo, NULL); - RCU_INIT_POINTER(np->dev->npinfo, NULL); + if (atomic_dec_and_test(&npinfo->refcnt)) call_rcu_bh(&npinfo->rcu, rcu_cleanup_netpoll_info); - } else - RCU_INIT_POINTER(np->dev->npinfo, NULL); } EXPORT_SYMBOL_GPL(__netpoll_cleanup);
This ensures that the ndo_netpoll_cleanup callback is called for every device that provides one. Otherwise there is a risk of reference leak with bonding for example, which depends on this callback to cleanup the slaves' references to netpoll info. Tested: see patch "netpoll: fix use after free" Signed-off-by: David Decotigny <decot@googlers.com> --- net/core/netpoll.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-)