Message ID | 1375461665-4186-1-git-send-email-nikolay@redhat.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
From: Nikolay Aleksandrov <nikolay@redhat.com> Date: Fri, 2 Aug 2013 18:41:05 +0200 > From: Nikolay Aleksandrov <Nikolay Aleksandrov nikolay@redhat.com> > > When arp monitoring is enabled the bonding relies on slaves' > dev_trans_start() value to check if the slave link is up or not, but for > 8021q devices that value is either stale or 0, and can't be used. So use > the 8021q's underlying device value. > > Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Handling this specially in bonding isn't really ideal. Please either hide this detail in dev_trans_start(), or (preferrably) have vlan_dev_hard_start_xmit() set the trans_start timestamp properly thus making this just work for everything. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/03/2013 12:32 AM, David Miller wrote: > From: Nikolay Aleksandrov <nikolay@redhat.com> > Date: Fri, 2 Aug 2013 18:41:05 +0200 > >> From: Nikolay Aleksandrov <Nikolay Aleksandrov nikolay@redhat.com> >> >> When arp monitoring is enabled the bonding relies on slaves' >> dev_trans_start() value to check if the slave link is up or not, but for >> 8021q devices that value is either stale or 0, and can't be used. So use >> the 8021q's underlying device value. >> >> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> > > Handling this specially in bonding isn't really ideal. > Indeed, it's not really a bonding problem. > Please either hide this detail in dev_trans_start(), or (preferrably) > have vlan_dev_hard_start_xmit() set the trans_start timestamp > properly thus making this just work for everything. > > Thanks. > Yep, I prefer option 2 as well. I will submit a patch tomorrow, thank you for the suggestions. Nik -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2013-08-02 at 15:32 -0700, David Miller wrote: > Handling this specially in bonding isn't really ideal. > > Please either hide this detail in dev_trans_start(), or (preferrably) > have vlan_dev_hard_start_xmit() set the trans_start timestamp > properly thus making this just work for everything. vlan is LLTX, so setting the timestamp would incur false sharing on multiqueue. But it's true we need a helper, because many callers do if (dev->priv_flags & IFF_802_1Q_VLAN) dev = vlan_dev_real_dev(dev); or the slighly better if (is_vlan_dev(dev)) dev = vlan_dev_real_dev(dev); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/03/2013 01:17 AM, Eric Dumazet wrote: > On Fri, 2013-08-02 at 15:32 -0700, David Miller wrote: > >> Handling this specially in bonding isn't really ideal. >> >> Please either hide this detail in dev_trans_start(), or (preferrably) >> have vlan_dev_hard_start_xmit() set the trans_start timestamp >> properly thus making this just work for everything. > > vlan is LLTX, so setting the timestamp would incur false sharing on > multiqueue. > Yeah, this statement actually explains a lot for me, thanks :-) I knew it was because of the LLTX, but I was wondering about the possible reasons for the xmit_lock_owner check. So basically the arp monitoring (or any dev_trans_start code) won't work with LLTX devices because they don't get their trans_start updated (not the txq trans_start nor the dev->trans_start), is this correct ? But what if the txqs get bound to a particular CPU, then the txq trans_start is okay to be updated I suppose. Nik > But it's true we need a helper, because many callers do > > if (dev->priv_flags & IFF_802_1Q_VLAN) > dev = vlan_dev_real_dev(dev); > > or the slighly better > > if (is_vlan_dev(dev)) > dev = vlan_dev_real_dev(dev); > > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Nikolay Aleksandrov <nikolay@redhat.com> wrote: >On 08/03/2013 01:17 AM, Eric Dumazet wrote: >> On Fri, 2013-08-02 at 15:32 -0700, David Miller wrote: >> >>> Handling this specially in bonding isn't really ideal. >>> >>> Please either hide this detail in dev_trans_start(), or (preferrably) >>> have vlan_dev_hard_start_xmit() set the trans_start timestamp >>> properly thus making this just work for everything. >> >> vlan is LLTX, so setting the timestamp would incur false sharing on >> multiqueue. >> >Yeah, this statement actually explains a lot for me, thanks :-) >I knew it was because of the LLTX, but I was wondering about the possible >reasons for the xmit_lock_owner check. >So basically the arp monitoring (or any dev_trans_start code) won't work >with LLTX devices because they don't get their trans_start updated (not the >txq trans_start nor the dev->trans_start), is this correct ? I think what Eric means is that it'll be a performance problem, because the cache line with the trans_start field will be thrashed between CPUs as updates compete with each other or inspection from dev_trans_start. I suspect it will work from a functional point of view (unless I'm missing something). As a practical matter, it looks like bonding is at present the only caller of dev_trans_start that passes either (a) a VLAN device, or (b) a device other than itself (most drivers seem to use it for transmit timeout detection on their own device). Having software devices set trans_start seems unnecessary (bonding doesn't set it, either), since they'll generally have a hardware device underneath with a real trans_start value. It might be hard to hunt down for tun, though. >But what if the txqs get bound to a particular CPU, then the txq >trans_start is okay to be updated I suppose. > >Nik >> But it's true we need a helper, because many callers do >> >> if (dev->priv_flags & IFF_802_1Q_VLAN) >> dev = vlan_dev_real_dev(dev); >> >> or the slighly better >> >> if (is_vlan_dev(dev)) >> dev = vlan_dev_real_dev(dev); For just the bonding dev_trans_start() for a VLAN case, one of the above in dev_trans_start() ought to be sufficient. -J --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 2013-08-03 at 01:29 +0200, Nikolay Aleksandrov wrote: > I knew it was because of the LLTX, but I was wondering about the possible > reasons for the xmit_lock_owner check. > So basically the arp monitoring (or any dev_trans_start code) won't work > with LLTX devices because they don't get their trans_start updated (not the > txq trans_start nor the dev->trans_start), is this correct ? Nope : LLTX devices are supposed to update their own dev->trans_start I added for these case a special comment as in : drivers/net/ethernet/atheros/atl1e/atl1e_main.c:1883: netdev->trans_start = jiffies; /* NETIF_F_LLTX driver :( */ trans_start is a txq property, and vlan devices have a single txq. Updating the vlandev->trans_start or the vlantxq->trans_start would be a performance killer on MQ ethernet. And frankly, as this trans_start is really seldom queried, it makes no sense to set it on fast path on the vlan device, if its properly done on the real device anyway. > But what if the txqs get bound to a particular CPU, then the txq > trans_start is okay to be updated I suppose. Not sure what you are saying. vlan xmit can be called from any cpus. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/03/2013 02:03 AM, Eric Dumazet wrote: > On Sat, 2013-08-03 at 01:29 +0200, Nikolay Aleksandrov wrote: > >> I knew it was because of the LLTX, but I was wondering about the possible >> reasons for the xmit_lock_owner check. >> So basically the arp monitoring (or any dev_trans_start code) won't work >> with LLTX devices because they don't get their trans_start updated (not the >> txq trans_start nor the dev->trans_start), is this correct ? > > Nope : LLTX devices are supposed to update their own dev->trans_start > I added for these case a special comment as in : > > drivers/net/ethernet/atheros/atl1e/atl1e_main.c:1883: netdev->trans_start = jiffies; /* NETIF_F_LLTX driver :( */ > Ah, didn't know about that, so there could be an LLTX device with proper trans_start if it updates it itself. Fair enough. > trans_start is a txq property, and vlan devices have a single txq. > > Updating the vlandev->trans_start or the vlantxq->trans_start would be a > performance killer on MQ ethernet. > Yeah, that part I got, that's why I said it explains a lot for me earlier :-) > And frankly, as this trans_start is really seldom queried, it makes no > sense to set it on fast path on the vlan device, if its properly done on > the real device anyway. > >> But what if the txqs get bound to a particular CPU, then the txq >> trans_start is okay to be updated I suppose. > > Not sure what you are saying. vlan xmit can be called from any cpus. > Never mind, I didn't notice that the 8021q dev has a single txq as you said. Thanks for taking the time to explain all this. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 07f257d4..5f2bad3 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -665,6 +665,17 @@ static int bond_check_dev_link(struct bonding *bond, return reporting ? -1 : BMSR_LSTATUS; } +/* For 8021q devices dev_trans_start() returns 0 or a stale value, so use the + * 8021q's underlying device value + */ +static unsigned long bond_dev_trans_start(struct net_device *dev) +{ + while (dev->priv_flags & IFF_802_1Q_VLAN) + dev = vlan_dev_real_dev(dev); + + return dev_trans_start(dev); +} + /*----------------------------- Multicast list ------------------------------*/ /* @@ -2750,7 +2761,7 @@ void bond_loadbalance_arp_mon(struct work_struct *work) * so it can wait */ bond_for_each_slave(bond, slave, i) { - unsigned long trans_start = dev_trans_start(slave->dev); + unsigned long trans_start = bond_dev_trans_start(slave->dev); if (slave->link != BOND_LINK_UP) { if (time_in_range(jiffies, @@ -2912,7 +2923,7 @@ static int bond_ab_arp_inspect(struct bonding *bond, int delta_in_ticks) * - (more than 2*delta since receive AND * the bond has an IP address) */ - trans_start = dev_trans_start(slave->dev); + trans_start = bond_dev_trans_start(slave->dev); if (bond_is_active_slave(slave) && (!time_in_range(jiffies, trans_start - delta_in_ticks, @@ -2947,7 +2958,7 @@ static void bond_ab_arp_commit(struct bonding *bond, int delta_in_ticks) continue; case BOND_LINK_UP: - trans_start = dev_trans_start(slave->dev); + trans_start = bond_dev_trans_start(slave->dev); if ((!bond->curr_active_slave && time_in_range(jiffies, trans_start - delta_in_ticks,