diff mbox

bonding: Don't allow mode change via sysfs with slaves present

Message ID 1321375482-8637-1-git-send-email-vfalico@redhat.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Veaceslav Falico Nov. 15, 2011, 4:44 p.m. UTC
When changing mode via bonding's sysfs, the slaves are not initialized
correctly. Forbid to change modes with slaves present to ensure that every
slave is initialized correctly via bond_enslave().

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_sysfs.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

Comments

Andy Gospodarek Nov. 15, 2011, 5 p.m. UTC | #1
On Tue, Nov 15, 2011 at 05:44:42PM +0100, Veaceslav Falico wrote:
> When changing mode via bonding's sysfs, the slaves are not initialized
> correctly. Forbid to change modes with slaves present to ensure that every
> slave is initialized correctly via bond_enslave().
> 
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

Looks good.  This behavior forces someone who wants to change to mode to
go through steps that are almost as destructive as when module options
are used to configure the mode.  I do not see a problem with this.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas de Pesloüan Nov. 15, 2011, 7:24 p.m. UTC | #2
Le 15/11/2011 18:00, Andy Gospodarek a écrit :
> On Tue, Nov 15, 2011 at 05:44:42PM +0100, Veaceslav Falico wrote:
>> When changing mode via bonding's sysfs, the slaves are not initialized
>> correctly. Forbid to change modes with slaves present to ensure that every
>> slave is initialized correctly via bond_enslave().
>>
>> Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
>
> Looks good.  This behavior forces someone who wants to change to mode to
> go through steps that are almost as destructive as when module options
> are used to configure the mode.  I do not see a problem with this.

Except the fact that is enforce one more constraint on the exact order one should write into sysfs 
to setup a bonding interface. We already have many such constraints and probably don't need more.

Currently, it is possible to enslave slaves before selecting the mode. The ifenslave-2.6 package 
from Debian currently enslave slaves before setting the mode and would break with this change.

NAK.

	Nicolas.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings Nov. 15, 2011, 7:33 p.m. UTC | #3
On Tue, 2011-11-15 at 20:24 +0100, Nicolas de Pesloüan wrote:
> Le 15/11/2011 18:00, Andy Gospodarek a écrit :
> > On Tue, Nov 15, 2011 at 05:44:42PM +0100, Veaceslav Falico wrote:
> >> When changing mode via bonding's sysfs, the slaves are not initialized
> >> correctly. Forbid to change modes with slaves present to ensure that every
> >> slave is initialized correctly via bond_enslave().
> >>
> >> Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
> >
> > Looks good.  This behavior forces someone who wants to change to mode to
> > go through steps that are almost as destructive as when module options
> > are used to configure the mode.  I do not see a problem with this.
> 
> Except the fact that is enforce one more constraint on the exact order one should write into sysfs 
> to setup a bonding interface. We already have many such constraints and probably don't need more.

From the administrator perspective, perhaps.  From the developer
perspective, the current flexibility of bonding makes it very difficult
to test and maintain.

> Currently, it is possible to enslave slaves before selecting the mode. The ifenslave-2.6 package 
> from Debian currently enslave slaves before setting the mode and would break with this change.
> 
> NAK.

It sounds like this feature has to be kept and fixed, then.

Ben.
Andy Gospodarek Nov. 15, 2011, 7:35 p.m. UTC | #4
On Tue, Nov 15, 2011 at 08:24:29PM +0100, Nicolas de Pesloüan wrote:
> Le 15/11/2011 18:00, Andy Gospodarek a écrit :
>> On Tue, Nov 15, 2011 at 05:44:42PM +0100, Veaceslav Falico wrote:
>>> When changing mode via bonding's sysfs, the slaves are not initialized
>>> correctly. Forbid to change modes with slaves present to ensure that every
>>> slave is initialized correctly via bond_enslave().
>>>
>>> Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
>>
>> Looks good.  This behavior forces someone who wants to change to mode to
>> go through steps that are almost as destructive as when module options
>> are used to configure the mode.  I do not see a problem with this.
>
> Except the fact that is enforce one more constraint on the exact order 
> one should write into sysfs to setup a bonding interface. We already have 
> many such constraints and probably don't need more.
>
> Currently, it is possible to enslave slaves before selecting the mode. 
> The ifenslave-2.6 package from Debian currently enslave slaves before 
> setting the mode and would break with this change.
>

Our testing indicates that 802.3ad mode bonding will not work unless the
devices are enslaved after the mode is set.  Does this mean that no one
using Debian is using 802.2ad mode or are they just not reporting it?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas de Pesloüan Nov. 15, 2011, 8:02 p.m. UTC | #5
Le 15/11/2011 20:35, Andy Gospodarek a écrit :
> On Tue, Nov 15, 2011 at 08:24:29PM +0100, Nicolas de Pesloüan wrote:
>> Le 15/11/2011 18:00, Andy Gospodarek a écrit :
>>> On Tue, Nov 15, 2011 at 05:44:42PM +0100, Veaceslav Falico wrote:
>>>> When changing mode via bonding's sysfs, the slaves are not initialized
>>>> correctly. Forbid to change modes with slaves present to ensure that every
>>>> slave is initialized correctly via bond_enslave().
>>>>
>>>> Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
>>>
>>> Looks good.  This behavior forces someone who wants to change to mode to
>>> go through steps that are almost as destructive as when module options
>>> are used to configure the mode.  I do not see a problem with this.
>>
>> Except the fact that is enforce one more constraint on the exact order
>> one should write into sysfs to setup a bonding interface. We already have
>> many such constraints and probably don't need more.
>>
>> Currently, it is possible to enslave slaves before selecting the mode.
>> The ifenslave-2.6 package from Debian currently enslave slaves before
>> setting the mode and would break with this change.
>>
>
> Our testing indicates that 802.3ad mode bonding will not work unless the
> devices are enslaved after the mode is set.  Does this mean that no one
> using Debian is using 802.2ad mode or are they just not reporting it?

I don't know. Possibly, they setup the bonding by hand, instead of relying on the bonding extensions 
of /etc/network/interfaces provided by the ifenslave-2.6 package.

Having a look at popularity for the package (http://qa.debian.org/popcon.php?package=ifenslave-2.6), 
it is obviously not the most popular one, but...

To try and fix the problem you noticed, if might be desirable to assert carrier off then on for all 
slaves when entering 802.3ad mode.

	Nicolas
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Gospodarek Nov. 15, 2011, 8:47 p.m. UTC | #6
On Tue, Nov 15, 2011 at 09:02:24PM +0100, Nicolas de Pesloüan wrote:
> Le 15/11/2011 20:35, Andy Gospodarek a écrit :
>> On Tue, Nov 15, 2011 at 08:24:29PM +0100, Nicolas de Pesloüan wrote:
>>> Le 15/11/2011 18:00, Andy Gospodarek a écrit :
>>>> On Tue, Nov 15, 2011 at 05:44:42PM +0100, Veaceslav Falico wrote:
>>>>> When changing mode via bonding's sysfs, the slaves are not initialized
>>>>> correctly. Forbid to change modes with slaves present to ensure that every
>>>>> slave is initialized correctly via bond_enslave().
>>>>>
>>>>> Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
>>>>
>>>> Looks good.  This behavior forces someone who wants to change to mode to
>>>> go through steps that are almost as destructive as when module options
>>>> are used to configure the mode.  I do not see a problem with this.
>>>
>>> Except the fact that is enforce one more constraint on the exact order
>>> one should write into sysfs to setup a bonding interface. We already have
>>> many such constraints and probably don't need more.
>>>
>>> Currently, it is possible to enslave slaves before selecting the mode.
>>> The ifenslave-2.6 package from Debian currently enslave slaves before
>>> setting the mode and would break with this change.
>>>
>>
>> Our testing indicates that 802.3ad mode bonding will not work unless the
>> devices are enslaved after the mode is set.  Does this mean that no one
>> using Debian is using 802.2ad mode or are they just not reporting it?
>
> I don't know. Possibly, they setup the bonding by hand, instead of 
> relying on the bonding extensions of /etc/network/interfaces provided by 
> the ifenslave-2.6 package.
>
> Having a look at popularity for the package 
> (http://qa.debian.org/popcon.php?package=ifenslave-2.6), it is obviously 
> not the most popular one, but...
>

Nicolas,

I took a look at the ifenslave package for debian more closely and it
actually looks like devices are enslaved last, after mode is set.  Can
you please take a look at this package and confirm what I'm seeing in
the 'pre-up' script.

It appears to me that setup_master sets the mode and enslave_slaves is
called after and enslaves the devices:

# Option slaves deprecated, replaced by bond-slaves, but still supported
# for backward compatibility.
IF_BOND_SLAVES=${IF_BOND_SLAVES:-$IF_SLAVES}

if [ "$IF_BOND_MASTER" ] ; then
        BOND_MASTER="$IF_BOND_MASTER"
        BOND_SLAVES="$IFACE"
else
        if [ "$IF_BOND_SLAVES" ] ; then
                BOND_MASTER="$IFACE"
                BOND_SLAVES="$IF_BOND_SLAVES"
        fi
fi

# Exit if nothing to do...
[ -z "$BOND_MASTER$BOND_SLAVES" ] && exit

add_master
early_setup_master
setup_master
enslave_slaves
exit 0

-andy

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Veaceslav Falico Nov. 15, 2011, 9:04 p.m. UTC | #7
On Tue, Nov 15, 2011 at 08:24:29PM +0100, Nicolas de Pesloüan wrote:
> Le 15/11/2011 18:00, Andy Gospodarek a écrit :
> >On Tue, Nov 15, 2011 at 05:44:42PM +0100, Veaceslav Falico wrote:
> >>When changing mode via bonding's sysfs, the slaves are not initialized
> >>correctly. Forbid to change modes with slaves present to ensure that every
> >>slave is initialized correctly via bond_enslave().
> >>
> >>Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
> >
> >Looks good.  This behavior forces someone who wants to change to mode to
> >go through steps that are almost as destructive as when module options
> >are used to configure the mode.  I do not see a problem with this.
> 
> Except the fact that is enforce one more constraint on the exact
> order one should write into sysfs to setup a bonding interface. We
> already have many such constraints and probably don't need more.
> 
> Currently, it is possible to enslave slaves before selecting the
> mode. The ifenslave-2.6 package from Debian currently enslave slaves
> before setting the mode and would break with this change.

Yes, it's possible, however the enslaved interfaces are initialized with
the current mode parameters, and when the mode is changed - they aren't
reinitialized at all. There are a lot of mode-specific initialization stuff
that's present only in bond_enslave(), and here are only some of the
(most obvious) snippets:

ALB/TLB balancing:
<snip>
        if (bond_is_lb(bond)) {
                /* bond_alb_init_slave() must be called before all other
 * stages since
                 * it might fail and we do not want to have to undo
                 * everything
                 */
                res = bond_alb_init_slave(bond, new_slave);
                if (res)
                        goto err_close;
        }    
</snip>

bond_alb_init_slave() is called only in this case. This means that the mac
address won't be changed at all, and some other stuff won't be properly
changed as well.

802.3ad:
<snip>
        if (bond->params.mode == BOND_MODE_8023AD) {
                /* add lacpdu mc addr to mc list */
                u8 lacpdu_multicast[ETH_ALEN] = MULTICAST_LACPDU_ADDR;

                dev_mc_add(slave_dev, lacpdu_multicast);
        }
</snip>

This means that the slave device will just drop all the LACPDUs.

So this means that at least two modes won't work if your first load the
bonding module with the default mode and then change it with slaves
attached. And I'm *really* sceptic on the other modes.

So even if the kernel doesn't show any error, it still doesn't work as
expected. To *really* fix this bug without adding any constraints would
require quite a few lines of code, and before it is fixed this patch is the
best way to avoid it.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas de Pesloüan Nov. 16, 2011, 12:02 p.m. UTC | #8
Le 15/11/2011 21:47, Andy Gospodarek a écrit :
> Nicolas,
>
> I took a look at the ifenslave package for debian more closely and it
> actually looks like devices are enslaved last, after mode is set.  Can
> you please take a look at this package and confirm what I'm seeing in
> the 'pre-up' script.
>
> It appears to me that setup_master sets the mode and enslave_slaves is
> called after and enslaves the devices:
>
> # Option slaves deprecated, replaced by bond-slaves, but still supported
> # for backward compatibility.
> IF_BOND_SLAVES=${IF_BOND_SLAVES:-$IF_SLAVES}
>
> if [ "$IF_BOND_MASTER" ] ; then
>          BOND_MASTER="$IF_BOND_MASTER"
>          BOND_SLAVES="$IFACE"
> else
>          if [ "$IF_BOND_SLAVES" ] ; then
>                  BOND_MASTER="$IFACE"
>                  BOND_SLAVES="$IF_BOND_SLAVES"
>          fi
> fi
>
> # Exit if nothing to do...
> [ -z "$BOND_MASTER$BOND_SLAVES" ]&&  exit
>
> add_master
> early_setup_master
> setup_master
> enslave_slaves
> exit 0

Andy,

I'm really surprise by this extract. In the most up to date version of the ifenslave-2.6 package 
(1.1.0-19), the order is:

add_master
early_setup_master
enslave_slaves
setup_master

early_setup_master was created to be able to do things that absolutely need to be done before 
enslavement. (See the comment just before this function). The idea was to do every possible setup in 
setup_master, after enslavement, except those that need to be done in early_setup_master. So having 
enslave_slaves after setup_master instead of before is definitely a mistake. Some of the operations 
in setup_master must be done after enslavement, in particular selecting the primary slave.

In version 1.1.0-18 (change log below), I checked all the possible order constraints of the sysfs 
interface and totally reworked most of the setup code, putting everything in the right order to 
achieve consistent results.

	ifenslave-2.6 (1.1.0-18) experimental; urgency=low

	  * Apply patch from Nicolas de Pesloüan:

	    - Major change: Check and fix the order in which the configuration is
	      written into /sys files, to ensure reliable results, according to the
	      tests done in the kernel (in drivers/net/bonding/bond_sysfs.c).
	    - Ensure that master is properly brought down when changing a parameter
	      that require it to be down.
	    - Ensure the master is brought up at the end of the setup, in the case
	      where ifup won't bring it up because it is currently configuring a slave.
	    - Add support for some previously unsupported /sys files: ad_select,
	      num_grat_arp, num_unsol_na, primary_reselect and queue_id.
	    - Enhance the documentation (README.Debian), to describe all the currently
	      supported bond-* options.
	    - Many other changes in the documentation.
	    - Reverse the order of the arguments to most sysfs_* internal functions, for
	      better readability.

It was a hard work, because there really exist many such constraints. I fail to find enough time to 
insert the result of this work into Documentation/networking/bonding.txt, but still plan to do so, 
even if the result is documented in the script you looked at.

Of course, it is possible to change the scripts in ifenslave-2.6 package to arrange for one more 
constraint. For as far as I understand, this would cause the Debian kernel that introduce the change 
we discuss about and all the future Debian kernels to be flagged with 'Breaks: ifenslave-2.6 (<< 
1.1.0-20)'. I'm not really comfortable with this and the Debian kernel team need to be involved. I 
copied them.

All that being said, I really advocate for less constraints on the sysfs setup. This is not strictly 
related to sysfs setup. If we eventually add a NETLINK interface for all the things we can setup 
using sysfs, we will face the exact same problem.

I perfectly understand, as Veaceslav noted in another email that there are a lot of mode-specific 
initialization stuff that's present only in bond_enslave(), but I think this is what needs to be 
fixed... Those initialization stuff should be moved out of bond_enslave() and called at appropriate 
sime, from bond_enslave() and from other locations, in particular when changing mode.

	Nicolas.




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Gospodarek Nov. 16, 2011, 10:02 p.m. UTC | #9
On Wed, Nov 16, 2011 at 01:02:21PM +0100, Nicolas de Pesloüan wrote:
> Le 15/11/2011 21:47, Andy Gospodarek a écrit :
>> Nicolas,
>>
>> I took a look at the ifenslave package for debian more closely and it
>> actually looks like devices are enslaved last, after mode is set.  Can
>> you please take a look at this package and confirm what I'm seeing in
>> the 'pre-up' script.
>>
>> It appears to me that setup_master sets the mode and enslave_slaves is
>> called after and enslaves the devices:
>>
>> # Option slaves deprecated, replaced by bond-slaves, but still supported
>> # for backward compatibility.
>> IF_BOND_SLAVES=${IF_BOND_SLAVES:-$IF_SLAVES}
>>
>> if [ "$IF_BOND_MASTER" ] ; then
>>          BOND_MASTER="$IF_BOND_MASTER"
>>          BOND_SLAVES="$IFACE"
>> else
>>          if [ "$IF_BOND_SLAVES" ] ; then
>>                  BOND_MASTER="$IFACE"
>>                  BOND_SLAVES="$IF_BOND_SLAVES"
>>          fi
>> fi
>>
>> # Exit if nothing to do...
>> [ -z "$BOND_MASTER$BOND_SLAVES" ]&&  exit
>>
>> add_master
>> early_setup_master
>> setup_master
>> enslave_slaves
>> exit 0
>
> Andy,
>
> I'm really surprise by this extract. In the most up to date version of 
> the ifenslave-2.6 package (1.1.0-19), the order is:
>
> add_master
> early_setup_master
> enslave_slaves
> setup_master
>
> early_setup_master was created to be able to do things that absolutely 
> need to be done before enslavement. (See the comment just before this 
> function). The idea was to do every possible setup in setup_master, after 
> enslavement, except those that need to be done in early_setup_master. So 
> having enslave_slaves after setup_master instead of before is definitely 
> a mistake. Some of the operations in setup_master must be done after 
> enslavement, in particular selecting the primary slave.
>
> In version 1.1.0-18 (change log below), I checked all the possible order 
> constraints of the sysfs interface and totally reworked most of the setup 
> code, putting everything in the right order to achieve consistent 
> results.
>
> 	ifenslave-2.6 (1.1.0-18) experimental; urgency=low
>
> 	  * Apply patch from Nicolas de Pesloüan:
>
> 	    - Major change: Check and fix the order in which the configuration is
> 	      written into /sys files, to ensure reliable results, according to the
> 	      tests done in the kernel (in drivers/net/bonding/bond_sysfs.c).
> 	    - Ensure that master is properly brought down when changing a parameter
> 	      that require it to be down.
> 	    - Ensure the master is brought up at the end of the setup, in the case
> 	      where ifup won't bring it up because it is currently configuring a slave.
> 	    - Add support for some previously unsupported /sys files: ad_select,
> 	      num_grat_arp, num_unsol_na, primary_reselect and queue_id.
> 	    - Enhance the documentation (README.Debian), to describe all the currently
> 	      supported bond-* options.
> 	    - Many other changes in the documentation.
> 	    - Reverse the order of the arguments to most sysfs_* internal functions, for
> 	      better readability.
>
> It was a hard work, because there really exist many such constraints. I 
> fail to find enough time to insert the result of this work into 
> Documentation/networking/bonding.txt, but still plan to do so, even if 
> the result is documented in the script you looked at.
>
> Of course, it is possible to change the scripts in ifenslave-2.6 package 
> to arrange for one more constraint. For as far as I understand, this 
> would cause the Debian kernel that introduce the change we discuss about 
> and all the future Debian kernels to be flagged with 'Breaks: 
> ifenslave-2.6 (<< 1.1.0-20)'. I'm not really comfortable with this and 
> the Debian kernel team need to be involved. I copied them.
>
> All that being said, I really advocate for less constraints on the sysfs 
> setup. This is not strictly related to sysfs setup. If we eventually add 
> a NETLINK interface for all the things we can setup using sysfs, we will 
> face the exact same problem.


I was looking at ifenslave 1.1.0-20.  If you look at Debian bug #641250
you will see a very similar report to what prompted Veaceslav to come up
with this patch and post it here.

ifenslave-2.6 (1.1.0-20) unstable; urgency=low

  * Use dashes consistently for bonding options in README.Debian.
    Closes: #639244
  * Enslave slaves only after fully setting up the master. Closes: #641250
  * Add build-arch and build-indep targets to debian/rules.

 -- Guus Sliepen <guus@debian.org>  Mon, 14 Nov 2011 11:36:21 +0100

ifenslave-2.6 (1.1.0-19) unstable; urgency=low

  * Don't bother trying to move configuration files anymore. This is not an
    issue anymore in for the next stable release, and it was broken anyway.
    Closes: #626959
  * Bump Standards-Version.

 -- Guus Sliepen <guus@debian.org>  Wed, 25 May 2011 18:42:32 +0200

ifenslave-2.6 (1.1.0-18) experimental; urgency=low

  * Apply patch from Nicolas de Pesloüan:

    - Major change: Check and fix the order in which the configuration is
      written into /sys files, to ensure reliable results, according to the
      tests done in the kernel (in drivers/net/bonding/bond_sysfs.c).
    - Ensure that master is properly brought down when changing a parameter
      that require it to be down.
    - Ensure the master is brought up at the end of the setup, in the case
      where ifup won't bring it up because it is currently configuring a slave.
    - Add support for some previously unsupported /sys files: ad_select,
      num_grat_arp, num_unsol_na, primary_reselect and queue_id.
    - Enhance the documentation (README.Debian), to describe all the currently
      supported bond-* options.
    - Many other changes in the documentation.
    - Reverse the order of the arguments to most sysfs_* internal functions, for
      better readability.

  * Upload to experimental due to the freeze.

 -- Guus Sliepen <guus@debian.org>  Tue, 21 Dec 2010 12:46:04 +0100

> I perfectly understand, as Veaceslav noted in another email that there 
> are a lot of mode-specific initialization stuff that's present only in 
> bond_enslave(), but I think this is what needs to be fixed... Those 
> initialization stuff should be moved out of bond_enslave() and called at 
> appropriate sime, from bond_enslave() and from other locations, in 
> particular when changing mode.
>

I think Veaceslav is working on this, but there is significant
re-organization that is needed to make it work properly and make sure it
is tested.  I could be wrong about how long it will take him, but to
test it properly it will take some time.

Since this problem seems like a pretty major problem and now Debian,
Fedora, RHEL, and Ubuntu all seem to have proper initialization scripts
to handle it, I stand behind my original ACK.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Flavio Leitner Nov. 17, 2011, 1:16 a.m. UTC | #10
On Wed, 16 Nov 2011 17:02:00 -0500
Andy Gospodarek <andy@greyhouse.net> wrote:

> On Wed, Nov 16, 2011 at 01:02:21PM +0100, Nicolas de Pesloüan wrote:
> > Le 15/11/2011 21:47, Andy Gospodarek a écrit :
> >> Nicolas,
> >>
> >> I took a look at the ifenslave package for debian more closely and
> >> it actually looks like devices are enslaved last, after mode is
> >> set.  Can you please take a look at this package and confirm what
> >> I'm seeing in the 'pre-up' script.
> >>
> >> It appears to me that setup_master sets the mode and
> >> enslave_slaves is called after and enslaves the devices:
> >>
> >> # Option slaves deprecated, replaced by bond-slaves, but still
> >> supported # for backward compatibility.
> >> IF_BOND_SLAVES=${IF_BOND_SLAVES:-$IF_SLAVES}
> >>
> >> if [ "$IF_BOND_MASTER" ] ; then
> >>          BOND_MASTER="$IF_BOND_MASTER"
> >>          BOND_SLAVES="$IFACE"
> >> else
> >>          if [ "$IF_BOND_SLAVES" ] ; then
> >>                  BOND_MASTER="$IFACE"
> >>                  BOND_SLAVES="$IF_BOND_SLAVES"
> >>          fi
> >> fi
> >>
> >> # Exit if nothing to do...
> >> [ -z "$BOND_MASTER$BOND_SLAVES" ]&&  exit
> >>
> >> add_master
> >> early_setup_master
> >> setup_master
> >> enslave_slaves
> >> exit 0
> >
> > Andy,
> >
> > I'm really surprise by this extract. In the most up to date version
> > of the ifenslave-2.6 package (1.1.0-19), the order is:
> >
> > add_master
> > early_setup_master
> > enslave_slaves
> > setup_master
> >
> > early_setup_master was created to be able to do things that
> > absolutely need to be done before enslavement. (See the comment
> > just before this function). The idea was to do every possible setup
> > in setup_master, after enslavement, except those that need to be
> > done in early_setup_master. So having enslave_slaves after
> > setup_master instead of before is definitely a mistake. Some of the
> > operations in setup_master must be done after enslavement, in
> > particular selecting the primary slave.
> >
> > In version 1.1.0-18 (change log below), I checked all the possible
> > order constraints of the sysfs interface and totally reworked most
> > of the setup code, putting everything in the right order to achieve
> > consistent results.
> >
> > 	ifenslave-2.6 (1.1.0-18) experimental; urgency=low
> >
> > 	  * Apply patch from Nicolas de Pesloüan:
> >
> > 	    - Major change: Check and fix the order in which the
> > configuration is written into /sys files, to ensure reliable
> > results, according to the tests done in the kernel (in
> > drivers/net/bonding/bond_sysfs.c).
> > 	    - Ensure that master is properly brought down when
> > changing a parameter that require it to be down.
> > 	    - Ensure the master is brought up at the end of the
> > setup, in the case where ifup won't bring it up because it is
> > currently configuring a slave.
> > 	    - Add support for some previously unsupported /sys
> > files: ad_select, num_grat_arp, num_unsol_na, primary_reselect and
> > queue_id.
> > 	    - Enhance the documentation (README.Debian), to
> > describe all the currently supported bond-* options.
> > 	    - Many other changes in the documentation.
> > 	    - Reverse the order of the arguments to most sysfs_*
> > internal functions, for better readability.
> >
> > It was a hard work, because there really exist many such
> > constraints. I fail to find enough time to insert the result of
> > this work into Documentation/networking/bonding.txt, but still plan
> > to do so, even if the result is documented in the script you looked
> > at.
> >
> > Of course, it is possible to change the scripts in ifenslave-2.6
> > package to arrange for one more constraint. For as far as I
> > understand, this would cause the Debian kernel that introduce the
> > change we discuss about and all the future Debian kernels to be
> > flagged with 'Breaks: ifenslave-2.6 (<< 1.1.0-20)'. I'm not really
> > comfortable with this and the Debian kernel team need to be
> > involved. I copied them.
> >
> > All that being said, I really advocate for less constraints on the
> > sysfs setup. This is not strictly related to sysfs setup. If we
> > eventually add a NETLINK interface for all the things we can setup
> > using sysfs, we will face the exact same problem.
> 
> 
> I was looking at ifenslave 1.1.0-20.  If you look at Debian bug
> #641250 you will see a very similar report to what prompted Veaceslav
> to come up with this patch and post it here.
> 
> ifenslave-2.6 (1.1.0-20) unstable; urgency=low
> 
>   * Use dashes consistently for bonding options in README.Debian.
>     Closes: #639244
>   * Enslave slaves only after fully setting up the master. Closes:
> #641250
> 
[...]

> > I perfectly understand, as Veaceslav noted in another email that
> > there are a lot of mode-specific initialization stuff that's
> > present only in bond_enslave(), but I think this is what needs to
> > be fixed... Those initialization stuff should be moved out of
> > bond_enslave() and called at appropriate sime, from bond_enslave()
> > and from other locations, in particular when changing mode.
> 
> I think Veaceslav is working on this, but there is significant
> re-organization that is needed to make it work properly and make sure
> it is tested.  I could be wrong about how long it will take him, but
> to test it properly it will take some time.

Indeed.

> Since this problem seems like a pretty major problem and now Debian,
> Fedora, RHEL, and Ubuntu all seem to have proper initialization
> scripts to handle it, I stand behind my original ACK.

I agree. I think allowing to change the mode after enslavement can
cause unpredictable issues and the follow-up patch will need some
careful work and testing, so you have my ACK here as well.

fbl
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 17, 2011, 9:04 p.m. UTC | #11
From: Veaceslav Falico <vfalico@redhat.com>
Date: Tue, 15 Nov 2011 17:44:42 +0100

> When changing mode via bonding's sysfs, the slaves are not initialized
> correctly. Forbid to change modes with slaves present to ensure that every
> slave is initialized correctly via bond_enslave().
> 
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

Lots of discussion... what is the final verdict on this patch bonding folks?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas de Pesloüan Nov. 17, 2011, 9:28 p.m. UTC | #12
Le 16/11/2011 23:02, Andy Gospodarek a écrit :

<snip>

> I was looking at ifenslave 1.1.0-20.  If you look at Debian bug #641250
> you will see a very similar report to what prompted Veaceslav to come up
> with this patch and post it here.

I completely missed this version. I recently reinstalled my system and forgot to add unstable to 
source.list.

> ifenslave-2.6 (1.1.0-20) unstable; urgency=low
>
>    * Use dashes consistently for bonding options in README.Debian.
>      Closes: #639244
>    * Enslave slaves only after fully setting up the master. Closes: #641250
>    * Add build-arch and build-indep targets to debian/rules.
>
>   -- Guus Sliepen<guus@debian.org>   Mon, 14 Nov 2011 11:36:21 +0100

Having a look at the change made to fix the bug described in #641250, I anticipate it will cause 
some regressions because some of the actions taken in setup_master must be done after enslavement 
and are now done before:

- primary must be set after mode (because only supported in some modes) and after enslavement.
- primary_reselect should be set after mode (because only supported in some modes), after 
enslavement and after primary.
- queue_id must be set after enslavement.
- active_slave must be set after mode and after enslavement.

I will prepare a -21 version.

<snip>

> Since this problem seems like a pretty major problem and now Debian,
> Fedora, RHEL, and Ubuntu all seem to have proper initialization scripts
> to handle it, I stand behind my original ACK.

You are right.

Acked-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>

	Nicolas.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas de Pesloüan Nov. 17, 2011, 10:36 p.m. UTC | #13
Le 17/11/2011 22:04, David Miller a écrit :
> From: Veaceslav Falico<vfalico@redhat.com>
> Date: Tue, 15 Nov 2011 17:44:42 +0100
>
>> When changing mode via bonding's sysfs, the slaves are not initialized
>> correctly. Forbid to change modes with slaves present to ensure that every
>> slave is initialized correctly via bond_enslave().
>>
>> Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
>
> Lots of discussion... what is the final verdict on this patch bonding folks?

Now looks good to me, and as all others participants agreed at the beginning, I think it's good for 
everyone.

	Nicolas.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 18, 2011, 12:32 a.m. UTC | #14
From: Nicolas de Pesloüan <nicolas.2p.debian@gmail.com>
Date: Thu, 17 Nov 2011 23:36:57 +0100

> Le 17/11/2011 22:04, David Miller a écrit :
>> From: Veaceslav Falico<vfalico@redhat.com>
>> Date: Tue, 15 Nov 2011 17:44:42 +0100
>>
>>> When changing mode via bonding's sysfs, the slaves are not initialized
>>> correctly. Forbid to change modes with slaves present to ensure that
>>> every
>>> slave is initialized correctly via bond_enslave().
>>>
>>> Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
>>
>> Lots of discussion... what is the final verdict on this patch bonding
>> folks?
> 
> Now looks good to me, and as all others participants agreed at the
> beginning, I think it's good for everyone.

Great, applied, thanks everyone.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 5a20804..4ef7e2f 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -319,6 +319,13 @@  static ssize_t bonding_store_mode(struct device *d,
 		goto out;
 	}
 
+	if (bond->slave_cnt > 0) {
+		pr_err("unable to update mode of %s because it has slaves.\n",
+			bond->dev->name);
+		ret = -EPERM;
+		goto out;
+	}
+
 	new_value = bond_parse_parm(buf, bond_mode_tbl);
 	if (new_value < 0)  {
 		pr_err("%s: Ignoring invalid mode value %.*s.\n",