[1/2] net: Fix sysctl restarts...

Message ID: m14olcu7xo.fsf@fess.ebiederm.org
State: Accepted, archived
Delegated to: David Miller

Commit Message

Eric W. Biederman Feb. 19, 2010, 11:22 p.m. UTC
Yuck.  It turns out that when we restart sysctls we were restarting
with the values already changed.  Which unfortunately meant that
the second time through we thought there was no change and skipped
all kinds of work, despite the fact that there was indeed a change.

I have fixed this the simplest way possible by restoring the changed
values when we restart the sysctl write.

One of my coworkers spotted this bug when after disabling forwarding
on an interface pings were still forwarded.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 net/ipv4/devinet.c  |    7 ++++++-
 net/ipv6/addrconf.c |   16 ++++++++++++++--
 2 files changed, 20 insertions(+), 3 deletions(-)

Comments

David Miller Feb. 19, 2010, 11:29 p.m. UTC | #1
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Fri, 19 Feb 2010 15:22:59 -0800

> 
> Yuck.  It turns out that when we restart sysctls we were restarting
> with the values already changed.  Which unfortunately meant that
> the second time through we thought there was no change and skipped
> all kinds of work, despite the fact that there was indeed a change.
> 
> I have fixed this the simplest way possible by restoring the changed
> values when we restart the sysctl write.
> 
> One of my coworkers spotted this bug when after disabling forwarding
> on an interface pings were still forwarded.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

What commit added this bug?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric W. Biederman Feb. 19, 2010, 11:35 p.m. UTC | #2
David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Fri, 19 Feb 2010 15:22:59 -0800
>
>> 
>> Yuck.  It turns out that when we restart sysctls we were restarting
>> with the values already changed.  Which unfortunately meant that
>> the second time through we thought there was no change and skipped
>> all kinds of work, despite the fact that there was indeed a change.
>> 
>> I have fixed this the simplest way possible by restoring the changed
>> values when we restart the sysctl write.
>> 
>> One of my coworkers spotted this bug when after disabling forwarding
>> on an interface pings were still forwarded.
>> 
>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>
> What commit added this bug?

When I fixed the deadlock that can happen if you write to forwarding
while removing the device.  The deadlock was fixed, the restart worked
but I somehow missed the fact that proc_dointvec modifies state and so
defeated the change detection.  *embarrassing*

commit 9b8adb5ea005fe73acd5dd58f9bd47eafa74c9d1
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Wed May 13 16:59:21 2009 +0000

    net: Fix devinet_sysctl_forward
    
    sysctls are unregistered with the rtnl_lock held making
    it unsafe to unconditionally grab the rtnl_lock.  Instead
    we need to call rtnl_trylock and restart the system call
    if we can not grab it.  Otherwise we could deadlock at unregistration
    time.
    
    Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>




David Miller Feb. 19, 2010, 11:41 p.m. UTC | #3
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Fri, 19 Feb 2010 15:35:27 -0800

> When I fixed the deadlock that can happen if you write to forwarding
> while removing the device.  The deadlock was fixed, the restart worked
> but I somehow missed the fact that proc_dointvec modifies state and so
> defeated the change detection.  *embarrassing*

Ok, I'll have to push these around to Linus and a couple -stable
releases.

Thanks.
Eric W. Biederman Feb. 19, 2010, 11:58 p.m. UTC | #4
David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Fri, 19 Feb 2010 15:35:27 -0800
>
>> When I fixed the deadlock that can happen if you write to forwarding
>> while removing the device.  The deadlock was fixed, the restart worked
>> but I somehow missed the fact that proc_dointvec modifies state and so
>> defeated the change detection.  *embarrassing*
>
> Ok, I'll have to push these around to Linus and a couple -stable
> releases.

The second patch fixes an issue which isn't quite as old.

I caught it when I was looking for other rtnl_lock issues that
I may have missed.  Thankfully the worst sysfs does is re-read
the string from userspace on a restart so none of the sysfs
rtnl_trylock cases have a nasty deadlock associated.

Eric


commit a160ee69c6a4622ed30c377a978554015e9931cb
Author: Johannes Berg <johannes@sipsolutions.net>
Date:   Mon Oct 5 02:22:23 2009 -0700

    wext: let get_wireless_stats() sleep
    
    A number of drivers (recently including cfg80211-based ones)
    assume that all wireless handlers, including statistics, can
    sleep and they often also implicitly assume that the rtnl is
    held around their invocation. This is almost always true now
    except when reading from sysfs:
    
      BUG: sleeping function called from invalid context at kernel/mutex.c:280
      in_atomic(): 1, irqs_disabled(): 0, pid: 10450, name: head
      2 locks held by head/10450:
       #0:  (&buffer->mutex){+.+.+.}, at: [<c10ceb99>] sysfs_read_file+0x24/0xf4
       #1:  (dev_base_lock){++.?..}, at: [<c12844ee>] wireless_show+0x1a/0x4c
      Pid: 10450, comm: head Not tainted 2.6.32-rc3 #1
      Call Trace:
       [<c102301c>] __might_sleep+0xf0/0xf7
       [<c1324355>] mutex_lock_nested+0x1a/0x33
       [<f8cea53b>] wdev_lock+0xd/0xf [cfg80211]
       [<f8cea58f>] cfg80211_wireless_stats+0x45/0x12d [cfg80211]
       [<c13118d6>] get_wireless_stats+0x16/0x1c
       [<c12844fe>] wireless_show+0x2a/0x4c
    
    Fix this by using the rtnl instead of dev_base_lock.
    
    Reported-by: Miles Lane <miles.lane@gmail.com>
    Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
David Miller Feb. 20, 2010, 12:02 a.m. UTC | #5
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Fri, 19 Feb 2010 15:58:53 -0800

> David Miller <davem@davemloft.net> writes:
> 
>> From: ebiederm@xmission.com (Eric W. Biederman)
>> Date: Fri, 19 Feb 2010 15:35:27 -0800
>>
>>> When I fixed the deadlock that can happen if you write to forwarding
>>> while removing the device.  The deadlock was fixed, the restart worked
>>> but I somehow missed the fact that proc_dointvec modifies state and so
>>> defeated the change detection.  *embarrassing*
>>
>> Ok, I'll have to push these around to Linus and a couple -stable
>> releases.
> 
> The second patch fixes an issue which isn't quite as old.
> 
> I caught it when I was looking for other rtnl_lock issues that
> I may have missed.  Thankfully the worst sysfs does is re-read
> the string from userspace on a restart so none of the sysfs
> rtnl_trylock cases have a nasty deadlock associated.
> 
> Eric
> 
> 
> commit a160ee69c6a4622ed30c377a978554015e9931cb
> Author: Johannes Berg <johannes@sipsolutions.net>
> Date:   Mon Oct 5 02:22:23 2009 -0700

So the second patch needs to go to fewer -stable releases than the
other one.  Thanks for the info.

Patch

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 014982b..51ca946 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1317,14 +1317,19 @@  static int devinet_sysctl_forward(ctl_table *ctl, int write,
 {
 	int *valp = ctl->data;
 	int val = *valp;
+	loff_t pos = *ppos;
 	int ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
 
 	if (write && *valp != val) {
 		struct net *net = ctl->extra2;
 
 		if (valp != &IPV4_DEVCONF_DFLT(net, FORWARDING)) {
-			if (!rtnl_trylock())
+			if (!rtnl_trylock()) {
+				/* Restore the original values before restarting */
+				*valp = val;
+				*ppos = pos;
 				return restart_syscall();
+			}
 			if (valp == &IPV4_DEVCONF_ALL(net, FORWARDING)) {
 				inet_forward_change(net);
 			} else if (*valp) {
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index b0d4a4b..5bcf0d3 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -502,8 +502,11 @@  static int addrconf_fixup_forwarding(struct ctl_table *table, int *p, int old)
 	if (p == &net->ipv6.devconf_dflt->forwarding)
 		return 0;
 
-	if (!rtnl_trylock())
+	if (!rtnl_trylock()) {
+		/* Restore the original values before restarting */
+		*p = old; 
 		return restart_syscall();
+	}
 
 	if (p == &net->ipv6.devconf_all->forwarding) {
 		__s32 newf = net->ipv6.devconf_all->forwarding;
@@ -4042,12 +4045,15 @@  int addrconf_sysctl_forward(ctl_table *ctl, int write,
 {
 	int *valp = ctl->data;
 	int val = *valp;
+	loff_t pos = *ppos;
 	int ret;
 
 	ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
 
 	if (write)
 		ret = addrconf_fixup_forwarding(ctl, valp, val);
+	if (ret)
+		*ppos = pos;
 	return ret;
 }
 
@@ -4089,8 +4095,11 @@  static int addrconf_disable_ipv6(struct ctl_table *table, int *p, int old)
 	if (p == &net->ipv6.devconf_dflt->disable_ipv6)
 		return 0;
 
-	if (!rtnl_trylock())
+	if (!rtnl_trylock()) {
+		/* Restore the original values before restarting */
+		*p = old; 
 		return restart_syscall();
+	}
 
 	if (p == &net->ipv6.devconf_all->disable_ipv6) {
 		__s32 newf = net->ipv6.devconf_all->disable_ipv6;
@@ -4109,12 +4118,15 @@  int addrconf_sysctl_disable(ctl_table *ctl, int write,
 {
 	int *valp = ctl->data;
 	int val = *valp;
+	loff_t pos = *ppos;
 	int ret;
 
 	ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
 
 	if (write)
 		ret = addrconf_disable_ipv6(ctl, valp, val);
+	if (ret)
+		*ppos = pos;
 	return ret;
 }