Patchwork ppp: Don't stop and restart queue on every TX packet

Submitter David Woodhouse
Date March 26, 2012, 10:03 a.m.
Message ID <1332756222.2379.66.camel@shinybook.infradead.org>
Permalink /patch/148682/
State Accepted
Delegated to: David Miller

Comments

David Woodhouse - March 26, 2012, 10:03 a.m.
For every transmitted packet, ppp_start_xmit() will stop the netdev
queue and then, if appropriate, restart it. This causes the TX softirq
to run, entirely gratuitously.

This is "only" a waste of CPU time in the normal case, but it's actively
harmful when the PPP device is a TEQL slave — the wakeup will cause the
offending device to receive the next TX packet from the TEQL queue, when
it *should* have gone to the next slave in the list. We end up seeing
large bursts of packets on just *one* slave device, rather than using
the full available bandwidth over all slaves.

This patch fixes the problem by *not* unconditionally stopping the queue
in ppp_start_xmit(). It adds a return value from ppp_xmit_process()
which indicates whether the queue should be stopped or not.

It *doesn't* remove the call to netif_wake_queue() from
ppp_xmit_process(), because other code paths (especially from
ppp_output_wakeup()) need it there and it's messy to push it out to the
other callers to do it based on the return value. So we leave it in
place — it's a no-op in the case where the queue wasn't stopped, so it's
harmless in the TX path.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Miller - April 3, 2012, 9:29 p.m.
From: David Woodhouse <dwmw2@infradead.org>
Date: Mon, 26 Mar 2012 11:03:42 +0100

Applied, thanks.  But:

> --- drivers/net/ppp/ppp_generic.c~	2012-01-26 00:39:32.000000000 +0000
> +++ drivers/net/ppp/ppp_generic.c	2012-03-26 10:32:31.286744147 +0100

Please -p1 root your patches in the future.
David Woodhouse - April 8, 2012, 7:58 p.m.
On Tue, 2012-04-03 at 17:29 -0400, David Miller wrote:
> From: David Woodhouse <dwmw2@infradead.org>
> Date: Mon, 26 Mar 2012 11:03:42 +0100
> 
> Applied, thanks. 

Hmm. After going through the locking issues on wakeup, for the PPPoATM
patch I just sent, I think that the above has introduced a bug. I'll
send a patch with a full explanation in a few moments...

Patch

--- drivers/net/ppp/ppp_generic.c~	2012-01-26 00:39:32.000000000 +0000
+++ drivers/net/ppp/ppp_generic.c	2012-03-26 10:32:31.286744147 +0100
@@ -235,7 +235,7 @@  struct ppp_net {
 /* Prototypes. */
 static int ppp_unattached_ioctl(struct net *net, struct ppp_file *pf,
 			struct file *file, unsigned int cmd, unsigned long arg);
-static void ppp_xmit_process(struct ppp *ppp);
+static int ppp_xmit_process(struct ppp *ppp);
 static void ppp_send_frame(struct ppp *ppp, struct sk_buff *skb);
 static void ppp_push(struct ppp *ppp);
 static void ppp_channel_push(struct channel *pch);
@@ -968,9 +968,9 @@  ppp_start_xmit(struct sk_buff *skb, stru
 	proto = npindex_to_proto[npi];
 	put_unaligned_be16(proto, pp);
 
-	netif_stop_queue(dev);
 	skb_queue_tail(&ppp->file.xq, skb);
-	ppp_xmit_process(ppp);
+	if (!ppp_xmit_process(ppp))
+		netif_stop_queue(dev);
 	return NETDEV_TX_OK;
 
  outf:
@@ -1048,10 +1048,11 @@  static void ppp_setup(struct net_device
  * Called to do any work queued up on the transmit side
  * that can now be done.
  */
-static void
+static int
 ppp_xmit_process(struct ppp *ppp)
 {
 	struct sk_buff *skb;
+	int ret = 0;
 
 	ppp_xmit_lock(ppp);
 	if (!ppp->closing) {
@@ -1061,10 +1062,13 @@  ppp_xmit_process(struct ppp *ppp)
 			ppp_send_frame(ppp, skb);
 		/* If there's no work left to do, tell the core net
 		   code that we can accept some more. */
-		if (!ppp->xmit_pending && !skb_peek(&ppp->file.xq))
+		if (!ppp->xmit_pending && !skb_peek(&ppp->file.xq)) {
 			netif_wake_queue(ppp->dev);
+			ret = 1;
+		}
 	}
 	ppp_xmit_unlock(ppp);
+	return ret;
 }
 
 static inline struct sk_buff *