diff mbox

[V2] bonding: fix netpoll in active-backup mode

Message ID 20110308212405.GX11864@gospo.rdu.redhat.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Andy Gospodarek March 8, 2011, 9:24 p.m. UTC
On Tue, Mar 08, 2011 at 05:58:56PM +0800, Amerigo Wang wrote:
> V2: avoid calling slave_diable_netpoll() with write_lock_bh() held.
> 
> netconsole doesn't work in active-backup mode, because we don't do anything
> for nic failover in active-backup mode. We should disable netpoll on the
> failing slave when it is detected down and enable netpoll when it becomes
> the active slave.
> 
> Tested by ifdown the current active slave and ifup it again for several times,
> netconsole works well.
> 
> Signed-off-by: WANG Cong <amwang@redhat.com>
> Cc: Neil Horman <nhorman@tuxdriver.com>
> 

It seems like you are going to a lot of trouble to fix a bug where
netpoll will not be setup on any interface that is down when enslaved.
That seems to be the only path that would not have slave->np setup
properly at enslavement.

Did you ever try just this?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Neil Horman March 8, 2011, 9:41 p.m. UTC | #1
On Tue, Mar 08, 2011 at 04:24:05PM -0500, Andy Gospodarek wrote:
> On Tue, Mar 08, 2011 at 05:58:56PM +0800, Amerigo Wang wrote:
> > V2: avoid calling slave_diable_netpoll() with write_lock_bh() held.
> > 
> > netconsole doesn't work in active-backup mode, because we don't do anything
> > for nic failover in active-backup mode. We should disable netpoll on the
> > failing slave when it is detected down and enable netpoll when it becomes
> > the active slave.
> > 
> > Tested by ifdown the current active slave and ifup it again for several times,
> > netconsole works well.
> > 
> > Signed-off-by: WANG Cong <amwang@redhat.com>
> > Cc: Neil Horman <nhorman@tuxdriver.com>
> > 
> 
> It seems like you are going to a lot of trouble to fix a bug where
> netpoll will not be setup on any interface that is down when enslaved.
> That seems to be the only path that would not have slave->np setup
> properly at enslavement.
> 
> Did you ever try just this?
> 
+1, this is what I was suggesting.  Just setup netpoll on all the slaves
regardless of status, and let their netpoll state follow that of the bond.
Neil

> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 0592e6d..8d93044 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -1352,8 +1352,6 @@ static int bond_netpoll_setup(struct net_device *dev, struct netpoll_info *ni)
>  
>  	read_lock(&bond->lock);
>  	bond_for_each_slave(bond, slave, i) {
> -		if (!IS_UP(slave->dev))
> -			continue;
>  		err = slave_enable_netpoll(slave);
>  		if (err) {
>  			__bond_netpoll_cleanup(bond);
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Amerigo Wang March 9, 2011, 12:34 p.m. UTC | #2
于 2011年03月09日 05:24, Andy Gospodarek 写道:
> On Tue, Mar 08, 2011 at 05:58:56PM +0800, Amerigo Wang wrote:
>> V2: avoid calling slave_diable_netpoll() with write_lock_bh() held.
>>
>> netconsole doesn't work in active-backup mode, because we don't do anything
>> for nic failover in active-backup mode. We should disable netpoll on the
>> failing slave when it is detected down and enable netpoll when it becomes
>> the active slave.
>>
>> Tested by ifdown the current active slave and ifup it again for several times,
>> netconsole works well.
>>
>> Signed-off-by: WANG Cong<amwang@redhat.com>
>> Cc: Neil Horman<nhorman@tuxdriver.com>
>>
>
> It seems like you are going to a lot of trouble to fix a bug where
> netpoll will not be setup on any interface that is down when enslaved.
> That seems to be the only path that would not have slave->np setup
> properly at enslavement.
>
> Did you ever try just this?

That was my first thought, but I was over-worried about the failing slave.
This way should work too. Mind to send it as a normal patch? :)

Thanks!

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Gospodarek March 11, 2011, 2:35 p.m. UTC | #3
On Wed, Mar 09, 2011 at 08:34:43PM +0800, Cong Wang wrote:
> 于 2011年03月09日 05:24, Andy Gospodarek 写道:
>> On Tue, Mar 08, 2011 at 05:58:56PM +0800, Amerigo Wang wrote:
>>> V2: avoid calling slave_diable_netpoll() with write_lock_bh() held.
>>>
>>> netconsole doesn't work in active-backup mode, because we don't do anything
>>> for nic failover in active-backup mode. We should disable netpoll on the
>>> failing slave when it is detected down and enable netpoll when it becomes
>>> the active slave.
>>>
>>> Tested by ifdown the current active slave and ifup it again for several times,
>>> netconsole works well.
>>>
>>> Signed-off-by: WANG Cong<amwang@redhat.com>
>>> Cc: Neil Horman<nhorman@tuxdriver.com>
>>>
>>
>> It seems like you are going to a lot of trouble to fix a bug where
>> netpoll will not be setup on any interface that is down when enslaved.
>> That seems to be the only path that would not have slave->np setup
>> properly at enslavement.
>>
>> Did you ever try just this?
>
> That was my first thought, but I was over-worried about the failing slave.
> This way should work too. Mind to send it as a normal patch? :)
>

I'm happy to submit the patch if it works in your environment.

I do not think anyone likes un-tested patches.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Amerigo Wang March 14, 2011, 6:04 a.m. UTC | #4
于 2011年03月11日 22:35, Andy Gospodarek 写道:
>
> I'm happy to submit the patch if it works in your environment.
>
> I do not think anyone likes un-tested patches.
>

Just test it by hands, it works, you can add

Tested-by: WANG Cong <amwang@redhat.com>

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 0592e6d..8d93044 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1352,8 +1352,6 @@  static int bond_netpoll_setup(struct net_device *dev, struct netpoll_info *ni)
 
 	read_lock(&bond->lock);
 	bond_for_each_slave(bond, slave, i) {
-		if (!IS_UP(slave->dev))
-			continue;
 		err = slave_enable_netpoll(slave);
 		if (err) {
 			__bond_netpoll_cleanup(bond);