diff mbox

[v2] net: fix sleeping while atomic problem in sock mem_cgroup.

Message ID 1324027869.2273.3.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Dec. 16, 2011, 9:31 a.m. UTC
Le vendredi 16 décembre 2011 à 13:09 +0400, Glauber Costa a écrit :
> Since we can't scan the proto_list to initialize sock cgroups, as it
> holds a rwlock, and we also want to keep the code generic enough to
> avoid calling the initialization functions of protocols directly,
> I propose we keep the interested parties in a separate list. This list
> is protected by a mutex so we can sleep and do the necessary allocations.
> 
> Also fixes a reference problem found by Randy Dunlap's randconfig.
> 
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
> CC: David S. Miller <davem@davemloft.net>
> CC: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Stephen Rothwell <sfr@canb.auug.org.au>
> CC: Randy Dunlap <rdunlap@xenotime.net>
> ---

Sorry to come late, but why dont we convert proto_list_lock to a mutex ?

 net/core/sock.c |   28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Glauber Costa Dec. 16, 2011, 9:53 a.m. UTC | #1
On 12/16/2011 01:31 PM, Eric Dumazet wrote:
> Le vendredi 16 décembre 2011 à 13:09 +0400, Glauber Costa a écrit :
>> Since we can't scan the proto_list to initialize sock cgroups, as it
>> holds a rwlock, and we also want to keep the code generic enough to
>> avoid calling the initialization functions of protocols directly,
>> I propose we keep the interested parties in a separate list. This list
>> is protected by a mutex so we can sleep and do the necessary allocations.
>>
>> Also fixes a reference problem found by Randy Dunlap's randconfig.
>>
>> Signed-off-by: Glauber Costa<glommer@parallels.com>
>> CC: Hiroyouki Kamezawa<kamezawa.hiroyu@jp.fujitsu.com>
>> CC: David S. Miller<davem@davemloft.net>
>> CC: Eric Dumazet<eric.dumazet@gmail.com>
>> CC: Stephen Rothwell<sfr@canb.auug.org.au>
>> CC: Randy Dunlap<rdunlap@xenotime.net>
>> ---
>
> Sorry to come late, but why dont we convert proto_list_lock to a mutex ?

I didn't suggest this, as I imagined there could be some performance 
implications to be drawn from it that may not be obvious to me.

But if it is okay with you net guys, it is certainly okay with me as well.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Dec. 16, 2011, 10:04 a.m. UTC | #2
Le vendredi 16 décembre 2011 à 13:53 +0400, Glauber Costa a écrit :
> On 12/16/2011 01:31 PM, Eric Dumazet wrote:
> > Le vendredi 16 décembre 2011 à 13:09 +0400, Glauber Costa a écrit :
> >> Since we can't scan the proto_list to initialize sock cgroups, as it
> >> holds a rwlock, and we also want to keep the code generic enough to
> >> avoid calling the initialization functions of protocols directly,
> >> I propose we keep the interested parties in a separate list. This list
> >> is protected by a mutex so we can sleep and do the necessary allocations.
> >>
> >> Also fixes a reference problem found by Randy Dunlap's randconfig.
> >>
> >> Signed-off-by: Glauber Costa<glommer@parallels.com>
> >> CC: Hiroyouki Kamezawa<kamezawa.hiroyu@jp.fujitsu.com>
> >> CC: David S. Miller<davem@davemloft.net>
> >> CC: Eric Dumazet<eric.dumazet@gmail.com>
> >> CC: Stephen Rothwell<sfr@canb.auug.org.au>
> >> CC: Randy Dunlap<rdunlap@xenotime.net>
> >> ---
> >
> > Sorry to come late, but why dont we convert proto_list_lock to a mutex ?
> 
> I didn't suggest this, as I imagined there could be some performance 
> implications to be drawn from it that may not be obvious to me.
> 
> But if it is okay with you net guys, it is certainly okay with me as well.

This 'lock' is not performance sensitive, its very seldom taken.

If we really wanted to be fast, it would not be a rwlock anymore ;)

"cat /proc/net/protocols" could eventually use RCU locking if we want
parallelism. (I dont think its needed)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Glauber Costa Dec. 16, 2011, 10:05 a.m. UTC | #3
On 12/16/2011 02:04 PM, Eric Dumazet wrote:
> Le vendredi 16 décembre 2011 à 13:53 +0400, Glauber Costa a écrit :
>> On 12/16/2011 01:31 PM, Eric Dumazet wrote:
>>> Le vendredi 16 décembre 2011 à 13:09 +0400, Glauber Costa a écrit :
>>>> Since we can't scan the proto_list to initialize sock cgroups, as it
>>>> holds a rwlock, and we also want to keep the code generic enough to
>>>> avoid calling the initialization functions of protocols directly,
>>>> I propose we keep the interested parties in a separate list. This list
>>>> is protected by a mutex so we can sleep and do the necessary allocations.
>>>>
>>>> Also fixes a reference problem found by Randy Dunlap's randconfig.
>>>>
>>>> Signed-off-by: Glauber Costa<glommer@parallels.com>
>>>> CC: Hiroyouki Kamezawa<kamezawa.hiroyu@jp.fujitsu.com>
>>>> CC: David S. Miller<davem@davemloft.net>
>>>> CC: Eric Dumazet<eric.dumazet@gmail.com>
>>>> CC: Stephen Rothwell<sfr@canb.auug.org.au>
>>>> CC: Randy Dunlap<rdunlap@xenotime.net>
>>>> ---
>>>
>>> Sorry to come late, but why dont we convert proto_list_lock to a mutex ?
>>
>> I didn't suggest this, as I imagined there could be some performance
>> implications to be drawn from it that may not be obvious to me.
>>
>> But if it is okay with you net guys, it is certainly okay with me as well.
>
> This 'lock' is not performance sensitive, its very seldom taken.
>
> If we really wanted to be fast, it would not be a rwlock anymore ;)
>
> "cat /proc/net/protocols" could eventually use RCU locking if we want
> parallelism. (I dont think its needed)
>

Well, in this case, I myself think this is a better solution. Do you 
want to push the patch yourself, or should I do it ?


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Dec. 16, 2011, 10:10 a.m. UTC | #4
Le vendredi 16 décembre 2011 à 14:05 +0400, Glauber Costa a écrit :

> Well, in this case, I myself think this is a better solution. Do you 
> want to push the patch yourself, or should I do it ?
> 

Please do it, I'll add my SOB after yours, thanks !



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/sock.c b/net/core/sock.c
index 5a6a906..0c67fac 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -136,7 +136,7 @@ 
 #include <net/tcp.h>
 #endif
 
-static DEFINE_RWLOCK(proto_list_lock);
+static DEFINE_MUTEX(proto_list_mutex);
 static LIST_HEAD(proto_list);
 
 #ifdef CONFIG_CGROUP_MEM_RES_CTLR_KMEM
@@ -145,7 +145,7 @@  int mem_cgroup_sockets_init(struct cgroup *cgrp, struct cgroup_subsys *ss)
 	struct proto *proto;
 	int ret = 0;
 
-	read_lock(&proto_list_lock);
+	mutex_lock(&proto_list_mutex);
 	list_for_each_entry(proto, &proto_list, node) {
 		if (proto->init_cgroup) {
 			ret = proto->init_cgroup(cgrp, ss);
@@ -154,13 +154,13 @@  int mem_cgroup_sockets_init(struct cgroup *cgrp, struct cgroup_subsys *ss)
 		}
 	}
 
-	read_unlock(&proto_list_lock);
+	mutex_unlock(&proto_list_mutex);
 	return ret;
 out:
 	list_for_each_entry_continue_reverse(proto, &proto_list, node)
 		if (proto->destroy_cgroup)
 			proto->destroy_cgroup(cgrp, ss);
-	read_unlock(&proto_list_lock);
+	mutex_unlock(&proto_list_mutex);
 	return ret;
 }
 
@@ -168,11 +168,11 @@  void mem_cgroup_sockets_destroy(struct cgroup *cgrp, struct cgroup_subsys *ss)
 {
 	struct proto *proto;
 
-	read_lock(&proto_list_lock);
+	mutex_lock(&proto_list_mutex);
 	list_for_each_entry_reverse(proto, &proto_list, node)
 		if (proto->destroy_cgroup)
 			proto->destroy_cgroup(cgrp, ss);
-	read_unlock(&proto_list_lock);
+	mutex_unlock(&proto_list_mutex);
 }
 #endif
 
@@ -2479,10 +2479,10 @@  int proto_register(struct proto *prot, int alloc_slab)
 		}
 	}
 
-	write_lock(&proto_list_lock);
+	mutex_lock(&proto_list_mutex);
 	list_add(&prot->node, &proto_list);
 	assign_proto_idx(prot);
-	write_unlock(&proto_list_lock);
+	mutex_unlock(&proto_list_mutex);
 	return 0;
 
 out_free_timewait_sock_slab_name:
@@ -2505,10 +2505,10 @@  EXPORT_SYMBOL(proto_register);
 
 void proto_unregister(struct proto *prot)
 {
-	write_lock(&proto_list_lock);
+	mutex_lock(&proto_list_mutex);
 	release_proto_idx(prot);
 	list_del(&prot->node);
-	write_unlock(&proto_list_lock);
+	mutex_unlock(&proto_list_mutex);
 
 	if (prot->slab != NULL) {
 		kmem_cache_destroy(prot->slab);
@@ -2531,9 +2531,9 @@  EXPORT_SYMBOL(proto_unregister);
 
 #ifdef CONFIG_PROC_FS
 static void *proto_seq_start(struct seq_file *seq, loff_t *pos)
-	__acquires(proto_list_lock)
+	__acquires(proto_list_mutex)
 {
-	read_lock(&proto_list_lock);
+	mutex_lock(&proto_list_mutex);
 	return seq_list_start_head(&proto_list, *pos);
 }
 
@@ -2543,9 +2543,9 @@  static void *proto_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 }
 
 static void proto_seq_stop(struct seq_file *seq, void *v)
-	__releases(proto_list_lock)
+	__releases(proto_list_mutex)
 {
-	read_unlock(&proto_list_lock);
+	mutex_unlock(&proto_list_mutex);
 }
 
 static char proto_method_implemented(const void *method)