Message ID | 20160808214812.GJ22974@cork |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On 2016/8/9 5:48, Jörn Engel wrote: > Up until 00503b6f702e (part of 3.14-rc1), the bonding driver could be > used to enslave tun-interfaces. 00503b6f702e broke that behaviour, > afaics as an unintended side-effect. > Hi Jorn: I don't understand your problem clearly, can you explain more about how the 00503b6f702e break tun-interfaces and we will try to fix it. and more, dev_set_mac_address will change the salver's mac address, some nic don't support to change the mac address and could not work as bond slave, so we need to check the return value, I don't think this patch has any effective improvement. Thanks. Ding > For the purpose of bond-over-tun in balance-rr mode, simply ignoring the > error from dev_set_mac_address() is good enough. I am not familiar > enough with the code to judge what new problems this patch might > introduce. > > Signed-off-by: Joern Engel <joern@purestorage.com> > --- > drivers/net/bonding/bond_main.c | 7 +------ > 1 file changed, 1 insertion(+), 6 deletions(-) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index 1f276fa30ba6..bc5dba847f50 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -1489,11 +1489,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) > */ > memcpy(addr.sa_data, bond_dev->dev_addr, bond_dev->addr_len); > addr.sa_family = slave_dev->type; > - res = dev_set_mac_address(slave_dev, &addr); > - if (res) { > - netdev_dbg(bond_dev, "Error %d calling set_mac_address\n", res); > - goto err_restore_mtu; > - } > + dev_set_mac_address(slave_dev, &addr); > } > > /* set slave flag before open to prevent IPv6 addrconf */ > @@ -1777,7 +1773,6 @@ err_restore_mac: > dev_set_mac_address(slave_dev, &addr); > } > > -err_restore_mtu: > dev_set_mtu(slave_dev, new_slave->original_mtu); > > err_free: >
Hello Tianhong! On Tue, Aug 09, 2016 at 10:18:41AM +0800, Ding Tianhong wrote: > > I don't understand your problem clearly, can you explain more about how the 00503b6f702e break tun-interfaces > and we will try to fix it. Here is a trivial testcase: openvpn --mktun --dev tun0 echo +tun0 > /sys/class/net/bond0/bonding/slaves Worked fine before your patch, no longer works after your patch. Works again after my patch. > and more, dev_set_mac_address will change the salver's mac address, some nic don't support to change the mac address and > could not work as bond slave, so we need to check the return value, I don't think this patch has any effective improvement. Using bonding in balance-rr mode, there doesn't seem to be a need to change the mac address. I suppose you might care in other modes, but I don't. Jörn -- Time? What's that? Time is only worth what you do with it. -- Theo de Raadt
Can we check slave_ops->ndo_set_mac_address? 1476 if ((slave_ops->ndo_set_mac_address) && (!bond->params.fail_over_mac || 1477 BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)) { 1478 /* Set slave to master's mac address. The application already 1479 * set the master's mac address to that of the first slave 1480 */ 1481 memcpy(addr.sa_data, bond_dev->dev_addr, bond_dev->addr_len); 1482 addr.sa_family = slave_dev->type; 1483 res = dev_set_mac_address(slave_dev, &addr); 1484 if (res) { 1485 netdev_dbg(bond_dev, "Error %d calling set_mac_address\n", res); 1486 goto err_restore_mtu; 1487 } 1488 } On Tue, Aug 9, 2016 at 11:09 AM, Jörn Engel <joern@purestorage.com> wrote: > Hello Tianhong! > > On Tue, Aug 09, 2016 at 10:18:41AM +0800, Ding Tianhong wrote: >> >> I don't understand your problem clearly, can you explain more about how the 00503b6f702e break tun-interfaces >> and we will try to fix it. > > Here is a trivial testcase: > openvpn --mktun --dev tun0 > echo +tun0 > /sys/class/net/bond0/bonding/slaves > > Worked fine before your patch, no longer works after your patch. Works > again after my patch. > >> and more, dev_set_mac_address will change the salver's mac address, some nic don't support to change the mac address and >> could not work as bond slave, so we need to check the return value, I don't think this patch has any effective improvement. > > Using bonding in balance-rr mode, there doesn't seem to be a need to > change the mac address. I suppose you might care in other modes, but I > don't. > > Jörn > > -- > Time? What's that? Time is only worth what you do with it. > -- Theo de Raadt
On 2016/8/9 11:09, Jörn Engel wrote: > Hello Tianhong! > > On Tue, Aug 09, 2016 at 10:18:41AM +0800, Ding Tianhong wrote: >> >> I don't understand your problem clearly, can you explain more about how the 00503b6f702e break tun-interfaces >> and we will try to fix it. > > Here is a trivial testcase: > openvpn --mktun --dev tun0 > echo +tun0 > /sys/class/net/bond0/bonding/slaves > > Worked fine before your patch, no longer works after your patch. Works > again after my patch. > Hi Jorn: I check the code and know the reason, the Tun device(or something like tun...) didn't has the ndo_set_mac_address, so will not add to bond as a slaver device in the mode rr, I think the original logic is fine for bond enslave processing, the old kernel just avoid this problem and not fix it, the bond need to considerate the virtual device which not support changing mac. >> and more, dev_set_mac_address will change the salver's mac address, some nic don't support to change the mac address and >> could not work as bond slave, so we need to check the return value, I don't think this patch has any effective improvement. > > Using bonding in balance-rr mode, there doesn't seem to be a need to > change the mac address. I suppose you might care in other modes, but I > don't. > > Jörn > > -- > Time? What's that? Time is only worth what you do with it. > -- Theo de Raadt > >
On 2016/8/9 13:29, zhuyj wrote: > Can we check slave_ops->ndo_set_mac_address? > > 1476 if ((slave_ops->ndo_set_mac_address) && > (!bond->params.fail_over_mac || > 1477 BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP)) { > 1478 /* Set slave to master's mac address. The > application already > 1479 * set the master's mac address to that of the first slave > 1480 */ > 1481 memcpy(addr.sa_data, bond_dev->dev_addr, > bond_dev->addr_len); > 1482 addr.sa_family = slave_dev->type; > 1483 res = dev_set_mac_address(slave_dev, &addr); > 1484 if (res) { > 1485 netdev_dbg(bond_dev, "Error %d calling > set_mac_address\n", res); > 1486 goto err_restore_mtu; > 1487 } > 1488 } > This patch is a simple solution for this problem, but I don't think it is the right solution, the bond is a virtual device base on L2, if the slave has no mac address, it will break the design principle, so we need to think more about it. I think if the bonding dev has to support L3 virtual device, we need to add new bond features to distinguish the dev and make the bond xmit and transfer without the mac address. Thanks Ding > On Tue, Aug 9, 2016 at 11:09 AM, Jörn Engel <joern@purestorage.com> wrote: >> Hello Tianhong! >> >> On Tue, Aug 09, 2016 at 10:18:41AM +0800, Ding Tianhong wrote: >>> >>> I don't understand your problem clearly, can you explain more about how the 00503b6f702e break tun-interfaces >>> and we will try to fix it. >> >> Here is a trivial testcase: >> openvpn --mktun --dev tun0 >> echo +tun0 > /sys/class/net/bond0/bonding/slaves >> >> Worked fine before your patch, no longer works after your patch. Works >> again after my patch. >> >>> and more, dev_set_mac_address will change the salver's mac address, some nic don't support to change the mac address and >>> could not work as bond slave, so we need to check the return value, I don't think this patch has any effective improvement. >> >> Using bonding in balance-rr mode, there doesn't seem to be a need to >> change the mac address. I suppose you might care in other modes, but I >> don't. >> >> Jörn >> >> -- >> Time? What's that? Time is only worth what you do with it. >> -- Theo de Raadt > > . >
On Tue, Aug 09, 2016 at 09:28:45PM +0800, Ding Tianhong wrote: > > This patch is a simple solution for this problem, but I don't think it is the right solution, the bond is a virtual device base on L2, > if the slave has no mac address, it will break the design principle, so we need to think more about it. The important point is: it worked. It solved a problem that at least three people cared enough about to send a bug report. Now it doesn't work anymore. That is a regression. Whether or not L2 has always been a design principle for bonding can be argued as well. But in the face of a regression, I suggest we fix the regression. > I think if the bonding dev has to support L3 virtual device, we need to add new bond features to distinguish the dev and make the > bond xmit and transfer without the mac address. Simply not checking errors when setting the mac address solves the problem for me. No new features needed. If you want to retain error handling, you can make those checks conditional on the mode. In balance-rr or broadcast mode, ignore the error. I don't need and haven't tested broadcast mode, but it doesn't seem to depend on any L2 attributes either. Jörn -- Do not stop an army on its way home. -- Sun Tzu
Jörn Engel <joern@purestorage.com> wrote: >Hello Tianhong! > >On Tue, Aug 09, 2016 at 10:18:41AM +0800, Ding Tianhong wrote: >> >> I don't understand your problem clearly, can you explain more about how the 00503b6f702e break tun-interfaces >> and we will try to fix it. > >Here is a trivial testcase: >openvpn --mktun --dev tun0 >echo +tun0 > /sys/class/net/bond0/bonding/slaves > >Worked fine before your patch, no longer works after your patch. Works >again after my patch. Could you describe your use case a bit further? Are you bonding together multiple VPN tunnels? This may be a regression, but since the patch that nominally introduced it was 2 years ago, the impact appears to be very narrow. >> and more, dev_set_mac_address will change the salver's mac address, some nic don't support to change the mac address and >> could not work as bond slave, so we need to check the return value, I don't think this patch has any effective improvement. > >Using bonding in balance-rr mode, there doesn't seem to be a need to >change the mac address. I suppose you might care in other modes, but I >don't. The balance-rr mode (as well as the -xor mode) is designed to interoperate with a Cisco Etherchannel-style static link aggregation, which requires all members to have the same MAC address for proper function. Now, the above notwithstanding, I don't have an issue if you want to bond together a couple of tun devices and can make it work. However, for the standard balance-rr case, the enslavement must fail if the call to set the slave MAC fails, as permitting the slave into the bond after set-MAC fails can result in a bond that silently loses packets. My tentative suggestion is that we extened fail_over_mac to cover additional modes, as a sort of "I really know what I'm doing" flag, and allow the enslavement to succeed when it is set. This would require setting an additional bonding option for this situation, but that doesn't seem to be a undue burden as this looks to be a niche use case. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com
On Tue, Aug 09, 2016 at 11:21:31AM -0700, Jay Vosburgh wrote: > Jörn Engel <joern@purestorage.com> wrote: > >On Tue, Aug 09, 2016 at 10:18:41AM +0800, Ding Tianhong wrote: > >> > >> I don't understand your problem clearly, can you explain more about how the 00503b6f702e break tun-interfaces > >> and we will try to fix it. > > > >Here is a trivial testcase: > >openvpn --mktun --dev tun0 > >echo +tun0 > /sys/class/net/bond0/bonding/slaves > > > >Worked fine before your patch, no longer works after your patch. Works > >again after my patch. > > Could you describe your use case a bit further? Are you bonding > together multiple VPN tunnels? Yes. Specificaly I use "ssh -w" to create tunnels. Ssh is single-threaded, so the tunnel is too slow. Aggregate a bunch and you get closer to link speed. Alternative would be pfSense. Afaics that easily beats anything Linux can offer. I'm just more familiar with Linux and trust ssh security more than most alternatives. > This may be a regression, but since the patch that nominally > introduced it was 2 years ago, the impact appears to be very narrow. Did you check the dates on the other two bug reports? Anyone experiencing the problem and checking google will come to the conclusion that you don't care and not bother sending yet another bug report. You then come to the conclusion that users don't care. > >> and more, dev_set_mac_address will change the salver's mac address, some nic don't support to change the mac address and > >> could not work as bond slave, so we need to check the return value, I don't think this patch has any effective improvement. > > > >Using bonding in balance-rr mode, there doesn't seem to be a need to > >change the mac address. I suppose you might care in other modes, but I > >don't. > > The balance-rr mode (as well as the -xor mode) is designed to > interoperate with a Cisco Etherchannel-style static link aggregation, > which requires all members to have the same MAC address for proper > function. Linux was designed to be a terminal for dialup to a university in Helsinki, if memory serves. Sometimes it is a good thing to work in ways the design never intended. Jörn -- A defeated army first battles and then seeks victory. -- Sun Tzu
From: Jörn Engel <joern@purestorage.com> Date: Tue, 9 Aug 2016 11:08:30 -0700 > On Tue, Aug 09, 2016 at 09:28:45PM +0800, Ding Tianhong wrote: >> >> I think if the bonding dev has to support L3 virtual device, we need to add new bond features to distinguish the dev and make the >> bond xmit and transfer without the mac address. > > Simply not checking errors when setting the mac address solves the > problem for me. No new features needed. But it only works in certain modes. So the best we can do is enforce the MAC address setting in the modes that absolutely require it. We cannot ignore the MAC address setting unilaterally.
From: Jörn Engel <joern@purestorage.com> Date: Tue, 9 Aug 2016 11:40:57 -0700 > On Tue, Aug 09, 2016 at 11:21:31AM -0700, Jay Vosburgh wrote: >> The balance-rr mode (as well as the -xor mode) is designed to >> interoperate with a Cisco Etherchannel-style static link aggregation, >> which requires all members to have the same MAC address for proper >> function. > > Linux was designed to be a terminal for dialup to a university in > Helsinki, if memory serves. Sometimes it is a good thing to work in > ways the design never intended. You're not addressing the issue Jay is trying to make you aware of in a useful way. You state that Jay doesn't want to help you, but your comment here shows that you really aren't exactly participating in the most positive manner either.
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 1f276fa30ba6..bc5dba847f50 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1489,11 +1489,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev) */ memcpy(addr.sa_data, bond_dev->dev_addr, bond_dev->addr_len); addr.sa_family = slave_dev->type; - res = dev_set_mac_address(slave_dev, &addr); - if (res) { - netdev_dbg(bond_dev, "Error %d calling set_mac_address\n", res); - goto err_restore_mtu; - } + dev_set_mac_address(slave_dev, &addr); } /* set slave flag before open to prevent IPv6 addrconf */ @@ -1777,7 +1773,6 @@ err_restore_mac: dev_set_mac_address(slave_dev, &addr); } -err_restore_mtu: dev_set_mtu(slave_dev, new_slave->original_mtu); err_free:
Up until 00503b6f702e (part of 3.14-rc1), the bonding driver could be used to enslave tun-interfaces. 00503b6f702e broke that behaviour, afaics as an unintended side-effect. For the purpose of bond-over-tun in balance-rr mode, simply ignoring the error from dev_set_mac_address() is good enough. I am not familiar enough with the code to judge what new problems this patch might introduce. Signed-off-by: Joern Engel <joern@purestorage.com> --- drivers/net/bonding/bond_main.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)