Patchwork [net-next] bonding: don't call alb_set_slave_mac_addr() while atomic

Submitter Veaceslav Falico
Date June 16, 2013, 5:20 p.m.
Message ID <1371403244-2891-1-git-send-email-vfalico@redhat.com>
Permalink /patch/251728/
State Superseded
Delegated to: David Miller

Comments

Veaceslav Falico - June 16, 2013, 5:20 p.m.
alb_set_slave_mac_addr() sets the mac address in alb mode via
dev_set_mac_address(), which might sleep. It's called from
alb_handle_addr_collision_on_attach() in atomic context (under
read_lock(bond->lock)), thus triggering a bug.

Fix this by moving the lock inside alb_handle_addr_collision_on_attach().

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_alb.c |   21 ++++++++++-----------
 1 files changed, 10 insertions(+), 11 deletions(-)
Nikolay Aleksandrov - June 17, 2013, 10:30 a.m.
On 06/16/2013 07:20 PM, Veaceslav Falico wrote:
> alb_set_slave_mac_addr() sets the mac address in alb mode via
> dev_set_mac_address(), which might sleep. It's called from
> alb_handle_addr_collision_on_attach() in atomic context (under
> read_lock(bond->lock)), thus triggering a bug.
> 
> Fix this by moving the lock inside alb_handle_addr_collision_on_attach().
> 
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

Hello,
I have an idea about this function. Currently
alb_handle_addr_collision_on_attach checks whether the slave's MAC address is
unique and, if it's not, tries to find an address among the other slaves'
permanent addresses. Instead of doing this, my proposal is:
1. this function and the only caller are running always inside RTNL, so I don't
think we need the read_lock at all, there can't be slave manipulation or MAC
address change during that period (if I'm not missing something).
2. the collision handling function can instead always succeed:
  - first walk over the slave list and check if there's a collision and
    also if any of the slaves has bond's MAC address, if there's no collision
    just return
  - if there's a collision:
   - if bond's address is not in use -> set it to the slave and return
   - else set a random MAC to the slave (eth_hw_addr_random) and return
 (and if we simplify it even further in the collision case we can just set a
random MAC always)
This way the code simplifies very nicely and we always get a unique MAC for the slave.
I've tried this and IMO it looks good.
What do you think ?
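
The steps proposed above can be sketched in userspace C. This is an illustration under stated assumptions, not code from the thread: `struct slave_sketch`, `handle_collision()`, `random_mac()` and the result enum are hypothetical names, and `rand()` stands in for the kernel's random address helper. The random address is made locally administered and unicast; it is unique only with high probability, since the sketch does not re-check it against the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical, simplified view of a slave: just its current address. */
struct slave_sketch {
	unsigned char dev_addr[MAC_LEN];
};

enum collision_result { NO_COLLISION, USED_BOND_MAC, USED_RANDOM_MAC };

/* Userspace analogue of a random locally administered unicast address. */
static void random_mac(unsigned char *addr)
{
	for (int i = 0; i < MAC_LEN; i++)
		addr[i] = (unsigned char)(rand() & 0xff);
	addr[0] &= 0xfe; /* clear the multicast bit */
	addr[0] |= 0x02; /* set the locally administered bit */
}

/* One walk over the list, then a decision that cannot fail. */
static enum collision_result
handle_collision(const struct slave_sketch *slaves, size_t n,
		 const unsigned char *bond_mac,
		 struct slave_sketch *new_slave)
{
	bool collision = false;
	bool bond_mac_in_use = false;

	for (size_t i = 0; i < n; i++) {
		if (!memcmp(slaves[i].dev_addr, new_slave->dev_addr, MAC_LEN))
			collision = true;
		if (!memcmp(slaves[i].dev_addr, bond_mac, MAC_LEN))
			bond_mac_in_use = true;
	}

	if (!collision)
		return NO_COLLISION;

	if (!bond_mac_in_use) {
		/* The bond's address is free, so hand it to the new slave. */
		memcpy(new_slave->dev_addr, bond_mac, MAC_LEN);
		return USED_BOND_MAC;
	}

	/* Otherwise fall back to a random address. */
	random_mac(new_slave->dev_addr);
	return USED_RANDOM_MAC;
}
```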

Cheers,
 Nik
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Veaceslav Falico - June 17, 2013, 2:12 p.m.
On Mon, Jun 17, 2013 at 12:30:54PM +0200, Nikolay Aleksandrov wrote:
>On 06/16/2013 07:20 PM, Veaceslav Falico wrote:
>> alb_set_slave_mac_addr() sets the mac address in alb mode via
>> dev_set_mac_address(), which might sleep. It's called from
>> alb_handle_addr_collision_on_attach() in atomic context (under
>> read_lock(bond->lock)), thus triggering a bug.
>>
>> Fix this by moving the lock inside alb_handle_addr_collision_on_attach().
>>
>> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>
>Hello,
>I have an idea about this function. Currently
>alb_handle_addr_collision_on_attach checks whether the slave's MAC address is
>unique and, if it's not, tries to find an address among the other slaves'
>permanent addresses. Instead of doing this, my proposal is:
>1. this function and the only caller are running always inside RTNL, so I don't
>think we need the read_lock at all, there can't be slave manipulation or MAC
>address change during that period (if I'm not missing something).

Yep, I initially thought about dropping the lock completely, because the MAC
address indeed can't change outside of rtnl_lock (and if it could, dropping
the lock in between, as this patch does, would be wrong anyway).

My concern was with bond_for_each_slave(), because it relies on the slave
list not being modified during the walk, so I decided to play it safe - it's
really not a hot path here.

However, I think you're right about that. We manipulate the slave list
basically only in bond_attach_slave() and bond_detach_slave(), which we reach
either via sysfs (rtnl_lock() is already held, if we don't have more bugs
there, hopefully...) or via ioctl/compat_ioctl, which both go through
dev_ioctl() and take the lock there. Oh, and on init/uninit, but that's also
protected by rtnl_lock().

If that's true, I think we can benefit from it (read: drop bond->lock) in
quite a few locations, though I haven't dug into it. Please correct me if
I've missed something.
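
The reasoning above, that a walk needs no second lock when every writer of the list already runs under one big lock, can be sketched in userspace. This is only an illustration under assumptions: the mutex `rtnl`, the `rtnl_held` flag and the `*_sketch` helpers are hypothetical stand-ins, and the flag is a single-threaded simplification of the kind of check the kernel's ASSERT_RTNL() performs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_SLAVES 8

/* Hypothetical stand-in for RTNL: one mutex guarding all list changes. */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;
static bool rtnl_held; /* simplification: only meaningful single-threaded */

static int slave_ids[MAX_SLAVES];
static int slave_cnt;

static void rtnl_lock_sketch(void)
{
	pthread_mutex_lock(&rtnl);
	rtnl_held = true;
}

static void rtnl_unlock_sketch(void)
{
	rtnl_held = false;
	pthread_mutex_unlock(&rtnl);
}

/* Every writer of the list insists on the big lock being held... */
static int attach_slave(int id)
{
	assert(rtnl_held);
	if (slave_cnt == MAX_SLAVES)
		return -1;
	slave_ids[slave_cnt++] = id;
	return 0;
}

/* ...so a reader that also holds it can walk without a second lock. */
static bool slave_present(int id)
{
	assert(rtnl_held);
	for (int i = 0; i < slave_cnt; i++) {
		if (slave_ids[i] == id)
			return true;
	}
	return false;
}
```

The argument only holds if every path that modifies the list really does take the big lock, which is exactly the point being checked in the message above.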

I'll try to remove the bond->lock now and test; I'll follow up with a V2 if
it works out, or write up my findings here.

>2. the collision handling function can instead always succeed:
>  - first walk over the slave list and check if there's a collision and
>    also if any of the slaves has bond's MAC address, if there's no collision
>    just return
>  - if there's a collision:
>   - if bond's address is not in use -> set it to the slave and return
>   - else set a random MAC to the slave (eth_hw_addr_random) and return
> (and if we simplify it even further in the collision case we can just set a
>random MAC always)
>This way the code simplifies very nicely and we always get a unique MAC for the slave.
>I've tried this and IMO it looks good.
>What do you think ?

I'm really not sure about the "always succeed" part - if the bond is using
our address and there's no free address left, it must be some kind of bug,
and it should be fixed instead of just setting a random MAC and going on.
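
The alternative being defended here, searching for a free permanent address and reporting an error when none exists, can be sketched as follows. This is a userspace illustration loosely modelled on the nested slave walk visible in the patch, not the driver's code: `struct slave_addr` and `find_free_perm_addr()` are hypothetical, and -ENOENT is an arbitrary error code chosen for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define HW_LEN 6

/* Hypothetical, simplified slave: permanent and current addresses only. */
struct slave_addr {
	unsigned char perm_hwaddr[HW_LEN];
	unsigned char dev_addr[HW_LEN];
};

/*
 * Return the index of a slave whose permanent address is not currently
 * used as the active address of any slave, or -ENOENT if there is none.
 */
static int find_free_perm_addr(const struct slave_addr *s, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int in_use = 0;

		for (size_t j = 0; j < n; j++) {
			if (!memcmp(s[i].perm_hwaddr, s[j].dev_addr, HW_LEN)) {
				in_use = 1;
				break;
			}
		}
		if (!in_use)
			return (int)i;
	}
	return -ENOENT; /* surface the problem instead of masking it */
}
```

A caller would treat the negative return value as a failure to attach, which keeps an unexpected state visible rather than hiding it behind a random address.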

>
>Cheers,
> Nik

Patch

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index a236234..3f14f99 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1175,10 +1175,7 @@  static void alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *sla
  * @slave.
  *
  * assumption: this function is called before @slave is attached to the
- * 	       bond slave list.
- *
- * caller must hold the bond lock for write since the mac addresses are compared
- * and may be swapped.
+ *	       bond slave list.
  */
 static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave)
 {
@@ -1196,6 +1193,9 @@  static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slav
 	 * slaves in the bond.
 	 */
 	if (!ether_addr_equal_64bits(slave->perm_hwaddr, bond->dev->dev_addr)) {
+
+		read_lock(&bond->lock);
+
 		bond_for_each_slave(bond, tmp_slave1, i) {
 			if (ether_addr_equal_64bits(tmp_slave1->dev->dev_addr,
 						    slave->dev->dev_addr)) {
@@ -1204,6 +1204,8 @@  static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slav
 			}
 		}
 
+		read_unlock(&bond->lock);
+
 		if (!found)
 			return 0;
 
@@ -1217,6 +1219,8 @@  static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slav
 	 */
 	free_mac_slave = NULL;
 
+	read_lock(&bond->lock);
+
 	bond_for_each_slave(bond, tmp_slave1, i) {
 		found = 0;
 		bond_for_each_slave(bond, tmp_slave2, j) {
@@ -1244,6 +1248,8 @@  static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slav
 		}
 	}
 
+	read_unlock(&bond->lock);
+
 	if (free_mac_slave) {
 		alb_set_slave_mac_addr(slave, free_mac_slave->perm_hwaddr);
 
@@ -1607,15 +1613,8 @@  int bond_alb_init_slave(struct bonding *bond, struct slave *slave)
 		return res;
 	}
 
-	/* caller must hold the bond lock for write since the mac addresses
-	 * are compared and may be swapped.
-	 */
-	read_lock(&bond->lock);
-
 	res = alb_handle_addr_collision_on_attach(bond, slave);
 
-	read_unlock(&bond->lock);
-
 	if (res) {
 		return res;
 	}