Patchwork mv643xx_eth: fix SMI bus access timeouts

Submitter Lennert Buytenhek
Date Nov. 1, 2008, 5:48 a.m.
Message ID <20081101054859.GB13348@xi.wantstofly.org>
Permalink /patch/6771/
State Not Applicable
Delegated to: Jeff Garzik

Comments

Lennert Buytenhek - Nov. 1, 2008, 5:48 a.m.
On Sat, Nov 01, 2008 at 06:32:20AM +0100, Lennert Buytenhek wrote:

> The mv643xx_eth mii bus implementation uses wait_event_timeout() to
> wait for SMI completion interrupts.
> 
> If wait_event_timeout() would return zero, mv643xx_eth would conclude
> that the SMI access timed out, but this is not necessarily true --
> wait_event_timeout() can also return zero in the case where the SMI
> completion interrupt did happen in time but where it took longer than
> the requested timeout for the process performing the SMI access to be
> scheduled again.  This would lead to occasional SMI access timeouts
> when the system would be under heavy load.
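
The failure mode in the quoted description can be modelled in user space.
The sketch below is not the mv643xx_eth code: struct smi_model,
model_wait_timeout() and both smi_wait_*() helpers are hypothetical names,
and time is counted in arbitrary ticks.  It assumes a wait primitive that
returns the remaining time, or zero once the timeout has elapsed by the
time the waiter runs again, and compares trusting that return value with
re-checking the completion flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one SMI access; all times are in ticks. */
struct smi_model {
	long irq_at;      /* tick at which the completion interrupt fires */
	long sched_delay; /* ticks between becoming runnable and running */
	bool done;        /* completion flag set by the interrupt */
};

/*
 * Models a timed wait as described in the quoted text: returns the
 * remaining ticks, or 0 if the timeout had elapsed by the time the
 * waiter actually ran again.
 */
static long model_wait_timeout(struct smi_model *m, long timeout)
{
	/* The waiter becomes runnable on the interrupt or on timer expiry. */
	long runnable_at = m->irq_at < timeout ? m->irq_at : timeout;
	/* Under load it only gets the CPU some time after that. */
	long runs_at = runnable_at + m->sched_delay;
	long remaining = timeout - runs_at;

	m->done = m->irq_at <= runs_at;
	return remaining > 0 ? remaining : 0;
}

/* Trusts the return value alone: a late wakeup looks like a timeout. */
static int smi_wait_naive(struct smi_model *m, long timeout)
{
	return model_wait_timeout(m, timeout) == 0 ? -ETIMEDOUT : 0;
}

/* Re-checks the completion flag before reporting a timeout. */
static int smi_wait_recheck(struct smi_model *m, long timeout)
{
	model_wait_timeout(m, timeout);
	return m->done ? 0 : -ETIMEDOUT;
}
```

With irq_at = 2, sched_delay = 50 and a timeout of 10, smi_wait_naive()
reports -ETIMEDOUT even though the completion arrived at tick 2, while
smi_wait_recheck() returns 0.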

FWIW, where I've been seeing this is mostly during heavy softirq
load, e.g. when doing routing when the box can't keep up.

When the system is being hammered like this, simple things like
querying the switch chip for its statistics counters (by doing
"ethtool -S <switch interface>") can take seconds.  Querying the
hardware switch stats consists of doing a lot of MII accesses, and
each of those MII accesses takes tens of milliseconds to return: the
issuer goes to sleep after issuing the MII access, waiting for an MII
done interrupt, but won't get scheduled again to issue its next MII
access until the rx softirq has decided that it has done enough
looping.
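
As a rough illustration of "done enough looping", here is a user-space
model of the restart loop that the diff to kernel/softirq.c in this
message touches.  It is a sketch under assumptions, not kernel code:
struct cpu_model, model_need_resched() and run_softirq_model() are
hypothetical names, one "batch" stands for one pass over the pending
softirqs, and the restart limit of 10 is assumed for the model.  The
check_resched flag switches between the loop condition before and after
the one-line change.

```c
#include <stdbool.h>

#define MAX_SOFTIRQ_RESTART 10 /* restart limit assumed for this model */

/* Hypothetical model of one CPU processing softirq work. */
struct cpu_model {
	int pending_batches; /* batches of softirq work still queued */
	int resched_after;   /* pass after which a task is waiting; 0 = never */
};

/* Stand-in for need_resched(): true once a task is waiting for the CPU. */
static bool model_need_resched(const struct cpu_model *c, int passes)
{
	return c->resched_after && passes >= c->resched_after;
}

/*
 * Returns how many passes the loop makes before giving up the CPU.
 * With check_resched false the loop only stops when the work runs out
 * or the restart limit is hit; with it true the loop also stops as
 * soon as a task is waiting to be scheduled.
 */
static int run_softirq_model(struct cpu_model *c, bool check_resched)
{
	int max_restart = MAX_SOFTIRQ_RESTART;
	int passes = 0;
	bool pending;

restart:
	c->pending_batches--; /* handle one batch of pending work */
	passes++;

	pending = c->pending_batches > 0;
	if (pending &&
	    !(check_resched && model_need_resched(c, passes)) &&
	    --max_restart)
		goto restart;

	return passes;
}
```

In this model, with 100 batches queued and a task becoming runnable
during the first pass, the loop makes 10 passes without the check and a
single pass with it, so the sleeping issuer gets the CPU back sooner.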

This patch makes it a lot more bearable -- but it's still only a bit
of a stopgap:



Patch

diff --git a/kernel/softirq.c b/kernel/softirq.c
index c506f26..f7fd630 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -215,7 +215,7 @@  restart:
 	local_irq_disable();
 
 	pending = local_softirq_pending();
-	if (pending && --max_restart)
+	if (pending && !need_resched() && --max_restart)
 		goto restart;
 
 	if (pending)