diff mbox

[RFC] Suspicious bug in module refcounting

Message ID 20090209151830.GC6018@dhcp35.suse.cz
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Michal Hocko Feb. 9, 2009, 3:18 p.m. UTC
Hi David,

On Wed 04-02-09 14:18:08, Rusty Russell wrote:
> On Wednesday 04 February 2009 00:17:21 Karsten Keil wrote:
> > The refcount is a per CPU atomic variable, module_refcount() simple add
> > in a fully unprotected loop (not disabled irqs, not protected against
> > scheduling) all per cpu values.
> 
> Hi Karsten,
> 
>    Yes, the BUG_ON() is overly aggressive.  And I really hate __module_get,
> and it looks like most of the callers are completely bogus.  The watchdog
> drivers use it to nail themselves in place in their open routines: this is
> OK, if a bit weird.
> 
>    We should only use __module_get() when you *can't handle* failure;
> otherwise you should accept that the admin did rmmod --wait and don't use the
> module any further.
> 
>   dmaengine.c seems to be taking liberties like this.  AFAICT it can error
> out, so why not just try_module_get() always?
> 
>   gameport.c, serio.c and input.c increment their own refcount, but to get
> into those init functions someone must be holding a refcount already (ie. a
> module depends on this module).  Ditto cyber2000fb.c, and MTD.
> 
>   mdio-bitbang.c should definitely use try_module_get.
> 
>   loop.c bumping its own refcount, Al might know why, but definitely can be
> try_module_get() if it's valid at all.
> 
>   net/socket.c can also handle failure, so that's another try_module_get.
> 
> etc.
> 
> > I think we should replace all unprotected __module_get() calls with
> > try_module_get(), or remove __module_get() completely.
> 
> Agreed.  We will need a "nail_module()" call for those legitimate uses (which
> should clear mod->exit, rather than manipulating the refcount at all).
> 
> Meanwhile, I'll remove the BUG_ON for 2.6.29.
> 
> Thanks,
> Rusty.
> 
> module: remove over-zealous check in __module_get()
> 
> module_refcount() isn't reliable outside stop_machine(), as demonstrated
> by Karsten Keil <kkeil@suse.de>, networking can trigger it under load
> (an inc on one cpu and dec on another while module_refcount() is tallying
>  can give false results, for example).
> 
> Almost noone should be using __module_get, but that's another issue.
> 
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> 
> diff --git a/include/linux/module.h b/include/linux/module.h
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -407,7 +407,6 @@ static inline void __module_get(struct m
>  static inline void __module_get(struct module *module)
>  {
>  	if (module) {
> -		BUG_ON(module_refcount(module) == 0);
>  		local_inc(__module_ref_addr(module, get_cpu()));
>  		put_cpu();
>  	}

Based on this change, would it make sense to update sys_accept to change
__module_get to try_module_get like in the following patch?


From 368c52b25414d1ccd0851d77fa5f20431487c172 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Mon, 9 Feb 2009 16:06:15 +0100
Subject: [PATCH] [NET] replace __module_get by try_module_get in accept4

After 7f9a50a5b89b87f8e754f59ae9968da28be618a5 we are not checking for
potential BUG in module reference counting. Therefore we should replace
__module_get by try_module_get and BUG if the module is being unloaded.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 net/socket.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

Comments

Rusty Russell Feb. 10, 2009, 3:15 a.m. UTC | #1
On Tuesday 10 February 2009 01:48:31 Michal Hocko wrote:
> Based on this change, would it make sense to update sys_accept to change
> __module_get to try_module_get like in the following patch?

I don't think so:

>  	/*
> -	 * We don't need try_module_get here, as the listening socket (sock)
> -	 * has the protocol module (sock->ops->owner) held.
> +	 * Socket's owner cannot be in unloading path because there
> +	 * must be at least one listening reference
>  	 */
> -	__module_get(newsock->ops->owner);
> +	if (unlikely(!try_module_get(newsock->ops->owner)))
> +		BUG();

rmmod --wait can make try_module_get fail even if the reference count isn't
zero.  But in this case, we should return an error from the accept call;
presumably the admin really doesn't want us to keep using the module...

Thanks,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Karsten Keil Feb. 10, 2009, 3:42 a.m. UTC | #2
On Tue, Feb 10, 2009 at 01:45:07PM +1030, Rusty Russell wrote:
> On Tuesday 10 February 2009 01:48:31 Michal Hocko wrote:
> > Based on this change, would it make sense to update sys_accept to change
> > __module_get to try_module_get like in the following patch?
> 
> I don't think so:
> 
> >  	/*
> > -	 * We don't need try_module_get here, as the listening socket (sock)
> > -	 * has the protocol module (sock->ops->owner) held.
> > +	 * Socket's owner cannot be in unloading path because there
> > +	 * must be at least one listening reference
> >  	 */
> > -	__module_get(newsock->ops->owner);
> > +	if (unlikely(!try_module_get(newsock->ops->owner)))
> > +		BUG();
> 
> rmmod --wait can make try_module_get fail even if the reference count isn't
> zero.  But in this case, we should return an error from the accept call;
> presumably the admin really doesn't want us to keep using the module...
> 

I was thinking about this for a while too, but I did not find a valid
error value for this case, the current accept syscall API only allow few
error codes and I think we should not break this.
Maybe -ECONNABORTED  could be used but it doesn't fit 100% and applications
may interpret this in a wrong way.
If the admin decide to remove the module, he should also stop the services
needing this resource. In this case  the effect of __module_get(newsock->ops->owner);
is ok, if the service is still in use, the module will not be removed and
still does the expected work, if all sockets are closed the module refcount
will reach zero and the module will go away.
Michal Hocko Feb. 10, 2009, 10:31 a.m. UTC | #3
On Tue 10-02-09 13:45:07, Rusty Russell wrote:
> On Tuesday 10 February 2009 01:48:31 Michal Hocko wrote:
> > Based on this change, would it make sense to update sys_accept to change
> > __module_get to try_module_get like in the following patch?
> 
> I don't think so:
> 
> >  	/*
> > -	 * We don't need try_module_get here, as the listening socket (sock)
> > -	 * has the protocol module (sock->ops->owner) held.
> > +	 * Socket's owner cannot be in unloading path because there
> > +	 * must be at least one listening reference
> >  	 */
> > -	__module_get(newsock->ops->owner);
> > +	if (unlikely(!try_module_get(newsock->ops->owner)))
> > +		BUG();
> 
> rmmod --wait can make try_module_get fail even if the reference count isn't
> zero.  

OK, I though that rmmod --wait waits for refcount==0 and then changes
the state.

> But in this case, we should return an error from the accept call;
> presumably the admin really doesn't want us to keep using the module...
> 
> Thanks,
> Rusty.
Rusty Russell Feb. 10, 2009, 1:36 p.m. UTC | #4
On Tuesday 10 February 2009 21:01:37 Michal Hocko wrote:
> On Tue 10-02-09 13:45:07, Rusty Russell wrote:
> > On Tuesday 10 February 2009 01:48:31 Michal Hocko wrote:
> > > Based on this change, would it make sense to update sys_accept to change
> > > __module_get to try_module_get like in the following patch?
> > 
> > I don't think so:
> > 
> > >  	/*
> > > -	 * We don't need try_module_get here, as the listening socket (sock)
> > > -	 * has the protocol module (sock->ops->owner) held.
> > > +	 * Socket's owner cannot be in unloading path because there
> > > +	 * must be at least one listening reference
> > >  	 */
> > > -	__module_get(newsock->ops->owner);
> > > +	if (unlikely(!try_module_get(newsock->ops->owner)))
> > > +		BUG();
> > 
> > rmmod --wait can make try_module_get fail even if the reference count isn't
> > zero.  
> 
> OK, I though that rmmod --wait waits for refcount==0 and then changes
> the state.

No, it has to stop all future use, otherwise it's useless under load.

Cheers,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/socket.c b/net/socket.c
index 35dd737..d0d4c92 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1444,10 +1444,11 @@  SYSCALL_DEFINE4(accept4, int, fd, struct sockaddr __user *, upeer_sockaddr,
 	newsock->ops = sock->ops;
 
 	/*
-	 * We don't need try_module_get here, as the listening socket (sock)
-	 * has the protocol module (sock->ops->owner) held.
+	 * Socket's owner cannot be in unloading path because there
+	 * must be at least one listening reference
 	 */
-	__module_get(newsock->ops->owner);
+	if (unlikely(!try_module_get(newsock->ops->owner)))
+		BUG();
 
 	newfd = sock_alloc_fd(&newfile, flags & O_CLOEXEC);
 	if (unlikely(newfd < 0)) {