Patchwork net: tx scalability works : trans_start

login
register
mail settings
Submitter Eric Dumazet
Date May 18, 2009, 7:21 a.m.
Message ID <4A110C7B.4020407@cosmosbay.com>
Download mbox | patch
Permalink /patch/27340/
State RFC
Delegated to: David Miller
Headers show

Comments

Eric Dumazet - May 18, 2009, 7:21 a.m.
David Miller a écrit :
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Tue, 12 May 2009 21:47:53 +0200
> 
>> struct net_device trans_start field is a hot spot on SMP and high performance
>> devices, particularly multi queues ones, because every transmitter dirties
>> it. Is main use is tx watchdog and bonding alive checks.
>>
>> But as most devices dont use NETIF_F_LLTX, we have to lock
>> a netdev_queue before calling their ndo_start_xmit(). So it makes
>> sense to move trans_start from net_device to netdev_queue. Its update
>> will occur on a already present (and in exclusive state) cache line, for
>> free.
>>
>> We can do this transition smoothly. An old driver continue to
>> update dev->trans_start, while an updated one updates txq->trans_start.
>>
>> Further patches could also put tx_bytes/tx_packets counters in 
>> netdev_queue to avoid dirtying dev->stats (vlan device comes to mind)
>>
>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> 
> I like this patch so I'm adding it to net-next-2.6
> 
> But you can go one step further and untangle what is arguably
> a huge mess.  There is no reason for the driver to set
> ->trans_start.
> 
> The driver gives us a success indication from ->hard_start_xmit()
> and we can use that to set txq->trans_start.
> 
> Then you can kill the driver code that sets dev->trans_start and
> then kill the netdev struct member as well.
> 
> Could you please do this work, thanks?

Indeed we can do this factorization, just have to take care of not adding
a new cache line miss on loopback device, as this one doesnt touch dev->trans_start.

Adding the update in dev_hard_start_xmit() would cost two additional tests...

if (rc == NETDEV_TX_OK && dev->netdev_ops->ndo_tx_timeout)
	txq->trans_start = jiffies;

Relying on NETIF_F_LLTX is a litle bit odd, since some drivers would
have to set txq->trans_start themselves, but unfortunatly ndo_start_xmit(skb, dev)
doesnt have txq as a third parameter. (we should add it, as high performance drivers 
have to recompute txq themselves). Ah, yes old drivers have one tx queue so it is
cheap to get txq, never mind :)



What do you think ? (following patch as RFC only, not even booted)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - May 18, 2009, 10:11 p.m.
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Mon, 18 May 2009 09:21:31 +0200

> Indeed we can do this factorization, just have to take care of not
> adding a new cache line miss on loopback device, as this one doesnt
> touch dev->trans_start.

If you want to test IFF_LOOPBACK or similar, I can accept that.
Or it can be some NETIF_F_* flag.

> What do you think ? (following patch as RFC only, not even booted)

This patch looks fine to me.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cd547d0..fe301b0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1773,8 +1773,10 @@  static inline void netif_tx_unlock_bh(struct net_device *dev)
 	}						\
 }
 
-#define HARD_TX_UNLOCK(dev, txq) {			\
+#define HARD_TX_UNLOCK(dev, txq, update_trans) {	\
 	if ((dev->features & NETIF_F_LLTX) == 0) {	\
+		if (update_trans)			\
+			txq->trans_start = jiffies;	\
 		__netif_tx_unlock(txq);			\
 	}						\
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index 14dd725..f71a69d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1879,11 +1879,11 @@  gso:
 			if (!netif_tx_queue_stopped(txq)) {
 				rc = 0;
 				if (!dev_hard_start_xmit(skb, dev, txq)) {
-					HARD_TX_UNLOCK(dev, txq);
+					HARD_TX_UNLOCK(dev, txq, 1);
 					goto out;
 				}
 			}
-			HARD_TX_UNLOCK(dev, txq);
+			HARD_TX_UNLOCK(dev, txq, 0);
 			if (net_ratelimit())
 				printk(KERN_CRIT "Virtual device %s asks to "
 				       "queue packet!\n", dev->name);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 27d0381..2f69420 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -145,7 +145,7 @@  static inline int qdisc_restart(struct Qdisc *q)
 	if (!netif_tx_queue_stopped(txq) &&
 	    !netif_tx_queue_frozen(txq))
 		ret = dev_hard_start_xmit(skb, dev, txq);
-	HARD_TX_UNLOCK(dev, txq);
+	HARD_TX_UNLOCK(dev, txq, ret == NETDEV_TX_OK);
 
 	spin_lock(root_lock);