Patchwork r8169: Fix irq masking in rtl8169_interrupt()

login
register
mail settings
Submitter Karsten Wiese
Date March 17, 2009, 7:34 p.m.
Message ID <200903172034.55241.fzu@wemgehoertderstaat.de>
Download mbox | patch
Permalink /patch/24572/
State Deferred
Delegated to: David Miller
Headers show

Comments

Karsten Wiese - March 17, 2009, 7:34 p.m.
Only adjust the NAPI bits of the interruptmask, if netif_rx_schedule_prep()
returns true.
Without this, I see interrupt storms here using 2.6.29-rcx and 2.6.27.19 and
device needed rmmod r8169 + modprobe r8169 to start working.

Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
---
 drivers/net/r8169.c |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)
françois romieu - March 17, 2009, 10:06 p.m.
Karsten Wiese <fzu@wemgehoertderstaat.de> :
> 
> Only adjust the NAPI bits of the interruptmask, if netif_rx_schedule_prep()
> returns true.
> Without this, I see interrupt storms here using 2.6.29-rcx and 2.6.27.19 and
> device needed rmmod r8169 + modprobe r8169 to start working.

How the patch is supposed to work is not completely clear to me.

Can you elaborate a bit ?
Karsten Wiese - March 18, 2009, 1:56 a.m.
Am Dienstag 17 März 2009 schrieb Francois Romieu:
> Karsten Wiese <fzu@wemgehoertderstaat.de> :
> > 
> > Only adjust the NAPI bits of the interruptmask, if netif_rx_schedule_prep()
> > returns true.
> > Without this, I see interrupt storms here using 2.6.29-rcx and 2.6.27.19 and
> > device needed rmmod r8169 + modprobe r8169 to start working.
> 
> How the patch is supposed to work is not completely clear to me.

Without it, if netif_rx_schedule_prep() returns 0, __netif_rx_schedule() is not
called so nothing is there to care for the transfer. Nonetheless intr_mask is 
changed so that its not possible that __netif_rx_schedule() is ever called
later.

Patch works by letting rtl8169_interrupt() again call netif_rx_schedule_prep()
until it succeeds and __netif_rx_schedule() is called.

Thanks,
Karsten
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
françois romieu - March 18, 2009, 6:40 a.m.
Karsten Wiese <fzu@wemgehoertderstaat.de> :
> Am Dienstag 17 März 2009 schrieb Francois Romieu:
[...]
> > How the patch is supposed to work is not completely clear to me.
> 
> Without it, if netif_rx_schedule_prep() returns 0, __netif_rx_schedule() is not
> called so nothing is there to care for the transfer. Nonetheless intr_mask is 
> changed so that its not possible that __netif_rx_schedule() is ever called
> later.
> 
> Patch works by letting rtl8169_interrupt() again call netif_rx_schedule_prep()
> until it succeeds and __netif_rx_schedule() is called.

Thanks for the explanation.

If netif_rx_schedule fails after napi is enabled, there is a
racing poll thread to care for the transfer.

Stated differently the bug is noticeable because napi_enable() 
should be called before request_irq() in rtl8169_open, right ?
Karsten Wiese - March 18, 2009, 11:59 a.m.
Am Mittwoch 18 März 2009 schrieb Francois Romieu:
> Karsten Wiese <fzu@wemgehoertderstaat.de> :
> > Am Dienstag 17 März 2009 schrieb Francois Romieu:
> [...]
> > > How the patch is supposed to work is not completely clear to me.
> > 
> > Without it, if netif_rx_schedule_prep() returns 0, __netif_rx_schedule() is not
> > called so nothing is there to care for the transfer. Nonetheless intr_mask is 
> > changed so that its not possible that __netif_rx_schedule() is ever called
> > later.
> > 
> > Patch works by letting rtl8169_interrupt() again call netif_rx_schedule_prep()
> > until it succeeds and __netif_rx_schedule() is called.
> 
> Thanks for the explanation.
> 
> If netif_rx_schedule fails after napi is enabled, there is a
> racing poll thread to care for the transfer.
> 
> Stated differently the bug is noticeable because napi_enable() 
> should be called before request_irq() in rtl8169_open, right ?

Propably yes. That would mean the first interrupt is received before
rtl_hw_start() is called in rtl8169_open. The source of that
interrupt would be something else then?

Thanks,
Karsten
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
françois romieu - March 18, 2009, 10:36 p.m.
Karsten Wiese <fzu@wemgehoertderstaat.de> :
> Am Mittwoch 18 März 2009 schrieb Francois Romieu:
[...]
> > If netif_rx_schedule fails after napi is enabled, there is a
> > racing poll thread to care for the transfer.
> > 
> > Stated differently the bug is noticeable because napi_enable() 
> > should be called before request_irq() in rtl8169_open, right ?
> 
> Propably yes. That would mean the first interrupt is received before
> rtl_hw_start() is called in rtl8169_open. The source of that
> interrupt would be something else then?

All we need is thus a shared interrupt racing on a different CPU
with rtl8169_open and some IO versus memory reordering so that
the hardware is started and a change in the status register can
be seen in the IRQ handler before napi_enable/NAPI_STATE_SCHED
escapes from the CPU where rtl8169_open is running...

It does not look like a credible explanation. :o/
David Miller - March 19, 2009, 6:35 a.m.
From: Francois Romieu <romieu@fr.zoreil.com>
Date: Wed, 18 Mar 2009 23:36:29 +0100

> Karsten Wiese <fzu@wemgehoertderstaat.de> :
> > Am Mittwoch 18 März 2009 schrieb Francois Romieu:
> [...]
> > > If netif_rx_schedule fails after napi is enabled, there is a
> > > racing poll thread to care for the transfer.
> > > 
> > > Stated differently the bug is noticeable because napi_enable() 
> > > should be called before request_irq() in rtl8169_open, right ?
> > 
> > Propably yes. That would mean the first interrupt is received before
> > rtl_hw_start() is called in rtl8169_open. The source of that
> > interrupt would be something else then?
> 
> All we need is thus a shared interrupt racing on a different CPU
> with rtl8169_open and some IO versus memory reordering so that
> the hardware is started and a change in the status register can
> be seen in the IRQ handler before napi_enable/NAPI_STATE_SCHED
> escapes from the CPU where rtl8169_open is running...
> 
> It does not look like a credible explanation. :o/

Agreed, looking at how this part of the driver works I can't
see how this condition can arise either.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Karsten Wiese - March 19, 2009, 10:58 a.m.
Am Mittwoch 18 März 2009 schrieb Francois Romieu:
> Karsten Wiese <fzu@wemgehoertderstaat.de> :
> > Am Mittwoch 18 März 2009 schrieb Francois Romieu:
> [...]
> > > If netif_rx_schedule fails after napi is enabled, there is a
> > > racing poll thread to care for the transfer.
> > > 
> > > Stated differently the bug is noticeable because napi_enable() 
> > > should be called before request_irq() in rtl8169_open, right ?
> > 
> > Propably yes. That would mean the first interrupt is received before
> > rtl_hw_start() is called in rtl8169_open. The source of that
> > interrupt would be something else then?
> 
> All we need is thus a shared interrupt racing on a different CPU
> with rtl8169_open and some IO versus memory reordering so that
> the hardware is started and a change in the status register can
> be seen in the IRQ handler before napi_enable/NAPI_STATE_SCHED
> escapes from the CPU where rtl8169_open is running...
> 
> It does not look like a credible explanation. :o/

Found it: CONFIG_DEBUG_SHIRQ is set here causing a "test-call" to
rtl8169_interrupt() from irq_request().

$SUBJECT patch is ok, me still thinks. You?

Thanks,
Karsten
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
françois romieu - March 19, 2009, 10:23 p.m.
Karsten Wiese <fzu@wemgehoertderstaat.de> :
[...]
> Found it: CONFIG_DEBUG_SHIRQ is set here causing a "test-call" to
> rtl8169_interrupt() from irq_request().
> 
> $SUBJECT patch is ok, me still thinks. You ?

I will have to convince myself that nothing bad can happen in the
window between netif_rx_schedule_prep and RTL_W16. No big deal.

I do not understand why / how the patch works at all : AFAIU, the status
register must contain some napi-related event in order for the patch to
make a difference. That's exactly what rtl8169_irq_mask_and_ack() is
supposed to prevent. -> Does it fail ?

So I'd really like to understand what is going on here.

PS: while I agree on the motivation for CONFIG_DEBUG_SHIRQ, I wonder if
it is 100% safe to run an IRQ handler from a context which does not
adhere to the original semantic. If its {disable/enable}_irq pair races
with a similar pair due to a different driver on the same (shared) irq,
things could imho be unsafe.
Karsten Wiese - March 20, 2009, 1:40 a.m.
Am Donnerstag 19 März 2009 schrieb Francois Romieu:
> Karsten Wiese <fzu@wemgehoertderstaat.de> :
> [...]
> > Found it: CONFIG_DEBUG_SHIRQ is set here causing a "test-call" to
> > rtl8169_interrupt() from irq_request().
> > 
> > $SUBJECT patch is ok, me still thinks. You ?
> 
> I will have to convince myself that nothing bad can happen in the
> window between netif_rx_schedule_prep and RTL_W16. No big deal.

RTL_W16 is a posted write, if it happens, theres no difference if its
posted before or after netif_rx_schedule_prep.
 
> I do not understand why / how the patch works at all : AFAIU, the status
> register must contain some napi-related event in order for the patch to
> make a difference.

It contains 0x25 when the "test-call" runs, Link, Rx and Tx Flags.

> That's exactly what rtl8169_irq_mask_and_ack() is
> supposed to prevent. -> Does it fail ?

rtl8169_irq_mask_and_ack() resets status once and prevents interrupts
being sent by device. Maybe status reset doesn't work because device is
stopped or device isn't _completely_ stopped and updates status flags later.
Maybe Tx status only becomes 0 if some Tx data is queued.
 
> So I'd really like to understand what is going on here.

Well, without patch any further interrupt causing napi-status isn't reset
_and_ netif_rx_schedule() isn't active. screaming interrupts.
slooooowish pc. Wonder why it proceeded at all though.

After rmmod/modprobe the status register contained 0x20 when in
rtl8169_irq_mask_and_ack() during "test-call", letting device work
normally.
 
> PS: while I agree on the motivation for CONFIG_DEBUG_SHIRQ, I wonder if
> it is 100% safe to run an IRQ handler from a context which does not
> adhere to the original semantic. If its {disable/enable}_irq pair races
> with a similar pair due to a different driver on the same (shared) irq,
> things could imho be unsafe.

From the comment of disable_irq():
	"This function waits for any pending IRQ handlers for this interrupt
 	to complete before returning."
No race possible, in theory ;-)

The pc I debugged this on is a UP with no other device active on the used
int16.

Thanks,
Karsten

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
françois romieu - March 22, 2009, 9:16 p.m.
Karsten Wiese <fzu@wemgehoertderstaat.de> :
[...]
> The pc I debugged this on is a UP with no other device active on the used
> int16.

/me wonders...

Does your network adapter support pxe or some boot time operation ?

Btw the XID of the device would be welcome.
Karsten Wiese - March 22, 2009, 11:53 p.m.
Am Sonntag 22 März 2009 schrieb Francois Romieu:
> Karsten Wiese <fzu@wemgehoertderstaat.de> :
> [...]
> > The pc I debugged this on is a UP with no other device active on the used
> > int16.
> 
> /me wonders...
> 
> Does your network adapter support pxe or some boot time operation ?

Yes, deavtivated in BIOS.

> Btw the XID of the device would be welcome.

XID is 04000000

Thanks,
Karsten
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index b347340..0c943ba 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -3708,12 +3708,11 @@  static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
 		rtl8169_check_link_status(dev, tp, ioaddr);
 
 	if (status & tp->napi_event) {
-		RTL_W16(IntrMask, tp->intr_event & ~tp->napi_event);
-		tp->intr_mask = ~tp->napi_event;
-
-		if (likely(netif_rx_schedule_prep(&tp->napi)))
+		if (likely(netif_rx_schedule_prep(&tp->napi))) {
+			RTL_W16(IntrMask, tp->intr_event & ~tp->napi_event);
+			tp->intr_mask = ~tp->napi_event;
 			__netif_rx_schedule(&tp->napi);
-		else if (netif_msg_intr(tp)) {
+		} else if (netif_msg_intr(tp)) {
 			printk(KERN_INFO "%s: interrupt %04x in poll\n",
 			       dev->name, status);
 		}