[net-next] ipv4: get rid of ip_ra_lock

Message ID 1493240115-779-1-git-send-email-xiyou.wangcong@gmail.com
State Superseded, archived
Delegated to: David Miller

Commit Message

Cong Wang April 26, 2017, 8:55 p.m. UTC
After commit 1215e51edad1 ("ipv4: fix a deadlock in ip_ra_control")
we always take the RTNL lock for ip_ra_control(), which is the only
place we update the list ip_ra_chain. The ip_ra_lock is therefore no
longer needed; we just need to disable BH there.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/ipv4/ip_sockglue.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

Comments

Eric Dumazet April 27, 2017, 12:46 p.m. UTC | #1
On Wed, 2017-04-26 at 13:55 -0700, Cong Wang wrote:
> After commit 1215e51edad1 ("ipv4: fix a deadlock in ip_ra_control")
> we always take RTNL lock for ip_ra_control() which is the only place
> we update the list ip_ra_chain, so the ip_ra_lock is no longer needed,
> we just need to disable BH there.
> 
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---

Looks great, but reading this code again, I believe we do not need to
disable BH at all?

Thanks.

Cong Wang April 27, 2017, 11:46 p.m. UTC | #2
On Thu, Apr 27, 2017 at 5:46 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wed, 2017-04-26 at 13:55 -0700, Cong Wang wrote:
>> After commit 1215e51edad1 ("ipv4: fix a deadlock in ip_ra_control")
>> we always take RTNL lock for ip_ra_control() which is the only place
>> we update the list ip_ra_chain, so the ip_ra_lock is no longer needed,
>> we just need to disable BH there.
>>
>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>> ---
>
> Looks great, but reading again this code, I believe we do not need to
> disable BH at all ?
>

Hmm, if we don't disable BH here, couldn't a reader in BH jump in and
break this critical section? Or is that fine for RCU?

Eric Dumazet April 27, 2017, 11:54 p.m. UTC | #3
On Thu, 2017-04-27 at 16:46 -0700, Cong Wang wrote:
> On Thu, Apr 27, 2017 at 5:46 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Wed, 2017-04-26 at 13:55 -0700, Cong Wang wrote:
> >> After commit 1215e51edad1 ("ipv4: fix a deadlock in ip_ra_control")
> >> we always take RTNL lock for ip_ra_control() which is the only place
> >> we update the list ip_ra_chain, so the ip_ra_lock is no longer needed,
> >> we just need to disable BH there.
> >>
> >> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> >> ---
> >
> > Looks great, but reading again this code, I believe we do not need to
> > disable BH at all ?
> >
> 
> Hmm, if we don't disable BH here, a reader in BH could jump in and
> break this critical section? Or that is fine for RCU?

It should be fine for RCU.

The spinlock (or mutex if this is RTNL) is protecting writers among
themselves. Here it should run in process context, with no specific
rules to disable preemption, hard or soft irqs.

The reader(s) do not care how writer(s) enforce their mutual
protection, or whether writer(s) disable hard or soft irqs.
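
The writer/reader split described above can be illustrated outside the
kernel. What follows is a minimal userspace sketch, not kernel code and
not the patch itself. It assumes a pthread mutex as a stand-in for RTNL
and C11 release/acquire atomics as stand-ins for rcu_assign_pointer()
and rcu_dereference(). The names ra_node, ra_control() and ra_count()
are hypothetical. Deferred freeing after a grace period is left out on
purpose, so removed nodes are simply leaked.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace analogue of struct ip_ra_chain. */
struct ra_node {
	int sk;					/* stand-in for struct sock * */
	_Atomic(struct ra_node *) next;
};

static _Atomic(struct ra_node *) ra_chain;

/* Assumption: this mutex plays the role of RTNL. It only serializes
 * writers against each other; readers never take it.
 */
static pthread_mutex_t writer_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: add (on != 0) or remove (on == 0) the entry for sk. */
static int ra_control(int sk, int on)
{
	struct ra_node *new_ra = on ? malloc(sizeof(*new_ra)) : NULL;
	_Atomic(struct ra_node *) *rap;
	struct ra_node *ra;

	pthread_mutex_lock(&writer_mutex);
	for (rap = &ra_chain;
	     (ra = atomic_load_explicit(rap, memory_order_relaxed)) != NULL;
	     rap = &ra->next) {
		if (ra->sk != sk)
			continue;
		if (on) {
			pthread_mutex_unlock(&writer_mutex);
			free(new_ra);
			return -EADDRINUSE;
		}
		/* Unlink with a single pointer store. A concurrent reader
		 * sees either the old chain or the new one, never a
		 * half-updated list.
		 */
		atomic_store_explicit(rap,
			atomic_load_explicit(&ra->next, memory_order_relaxed),
			memory_order_release);
		pthread_mutex_unlock(&writer_mutex);
		/* Real RCU would free ra only after a grace period. */
		return 0;
	}
	if (!new_ra) {
		pthread_mutex_unlock(&writer_mutex);
		return -ENOBUFS;
	}
	assert(ra == NULL);	/* loop ended at the tail of the chain */
	new_ra->sk = sk;
	atomic_init(&new_ra->next, ra);
	/* Publish: the release store makes the initialized node visible. */
	atomic_store_explicit(rap, new_ra, memory_order_release);
	pthread_mutex_unlock(&writer_mutex);
	return 0;
}

/* Reader side: no lock and no knowledge of how writers exclude each
 * other, only acquire loads of the published pointers.
 */
static int ra_count(void)
{
	struct ra_node *ra;
	int n = 0;

	for (ra = atomic_load_explicit(&ra_chain, memory_order_acquire);
	     ra != NULL;
	     ra = atomic_load_explicit(&ra->next, memory_order_acquire))
		n++;
	return n;
}
```

In this sketch, ra_count() stays correct whatever the writer uses for
mutual exclusion, because it depends only on each pointer update being
a single publishing store. That is the property that makes disabling
BH on the writer side unnecessary for the readers' sake.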

Cong Wang April 28, 2017, 4:54 p.m. UTC | #4
On Thu, Apr 27, 2017 at 4:54 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2017-04-27 at 16:46 -0700, Cong Wang wrote:
>> On Thu, Apr 27, 2017 at 5:46 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > On Wed, 2017-04-26 at 13:55 -0700, Cong Wang wrote:
>> >> After commit 1215e51edad1 ("ipv4: fix a deadlock in ip_ra_control")
>> >> we always take RTNL lock for ip_ra_control() which is the only place
>> >> we update the list ip_ra_chain, so the ip_ra_lock is no longer needed,
>> >> we just need to disable BH there.
>> >>
>> >> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>> >> ---
>> >
>> > Looks great, but reading again this code, I believe we do not need to
>> > disable BH at all ?
>> >
>>
>> Hmm, if we don't disable BH here, a reader in BH could jump in and
>> break this critical section? Or that is fine for RCU?
>
> It should be fine for RCU.
>
> The spinlock (or mutex if this is RTNL) is protecting writers among
> themselves. Here it should run in process context, with no specific
> rules to disable preemption, hard or soft irqs.
>
> The reader(s) do not care of how writer(s) enforce their mutual
> protection, and if writer(s) disable hard or soft irqs.

Fair enough! This refreshes my understanding of RCU.

I will send V2. The ASSERT_RTNL() is unnecessary too, as
we already have one in rcu_dereference_protected().

Thanks!

Patch

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 1d46d05..2923ea1 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -330,7 +330,6 @@  int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
    sent to multicast group to reach destination designated router.
  */
 struct ip_ra_chain __rcu *ip_ra_chain;
-static DEFINE_SPINLOCK(ip_ra_lock);
 
 
 static void ip_ra_destroy_rcu(struct rcu_head *head)
@@ -352,21 +351,21 @@  int ip_ra_control(struct sock *sk, unsigned char on,
 
 	new_ra = on ? kmalloc(sizeof(*new_ra), GFP_KERNEL) : NULL;
 
-	spin_lock_bh(&ip_ra_lock);
+	ASSERT_RTNL();
+	local_bh_disable();
 	for (rap = &ip_ra_chain;
-	     (ra = rcu_dereference_protected(*rap,
-			lockdep_is_held(&ip_ra_lock))) != NULL;
+	     (ra = rtnl_dereference(*rap)) != NULL;
 	     rap = &ra->next) {
 		if (ra->sk == sk) {
 			if (on) {
-				spin_unlock_bh(&ip_ra_lock);
+				local_bh_enable();
 				kfree(new_ra);
 				return -EADDRINUSE;
 			}
 			/* dont let ip_call_ra_chain() use sk again */
 			ra->sk = NULL;
 			RCU_INIT_POINTER(*rap, ra->next);
-			spin_unlock_bh(&ip_ra_lock);
+			local_bh_enable();
 
 			if (ra->destructor)
 				ra->destructor(sk);
@@ -381,7 +380,7 @@  int ip_ra_control(struct sock *sk, unsigned char on,
 		}
 	}
 	if (!new_ra) {
-		spin_unlock_bh(&ip_ra_lock);
+		local_bh_enable();
 		return -ENOBUFS;
 	}
 	new_ra->sk = sk;
@@ -390,7 +389,7 @@  int ip_ra_control(struct sock *sk, unsigned char on,
 	RCU_INIT_POINTER(new_ra->next, ra);
 	rcu_assign_pointer(*rap, new_ra);
 	sock_hold(sk);
-	spin_unlock_bh(&ip_ra_lock);
+	local_bh_enable();
 
 	return 0;
 }