diff mbox

NAPI: Polling function requesting extra calls - difference between 3.18.11 and 4.0.0

Message ID CACna6rzMMUA_MQemYyP93+TjP4aocL1WaYQUJFwQm3+DLNAefg@mail.gmail.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Rafał Miłecki April 23, 2015, 6:18 p.m. UTC
Hi,

Recently Felix improved bgmac driver to be smarter in situation where
new packets were Tx/Rx-ed during bgmac_poll execution. It was handled
in:
eb64e29 bgmac: leave interrupts disabled as long as there is work to do
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=eb64e2923a886441c7b322f138b36029f3fa6a36
With above change, after handling all skb-s in bgmac_poll, bgmac
checks if there are new sbk-s to be handled. If so, it doesn't call
napi_complete & it doesn't enable interrupts.

Above commit was merged for 4.1 kernel, but I work with OpenWrt based
on older releases, so I have all bgmac changes backported to 3.8.11
and 4.0.


With 3.8.11 Felix's change seems to be working fine, I did some debugging:

[   44.970000] [bgmac_interrupt] int_status & int_mask = 0x01000000

[   44.980000] [bgmac_poll] START weight:64
[   44.990000] [bgmac_poll] Read handled:0 skb-s
[   44.990000] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:0 BGMAC_IS_RX:0
[   45.000000] [bgmac_poll] Read skb-s < weight, calling napi_complete, en IRQs
[   45.010000] [bgmac_poll] END returning handled:0

[   48.040000] [bgmac_interrupt] int_status & int_mask = 0x01000000

[   48.050000] [bgmac_poll] START weight:64
[   48.060000] [bgmac_poll] Read handled:0 skb-s
[   48.060000] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:0 BGMAC_IS_RX:0
[   48.070000] [bgmac_poll] Read skb-s < weight, calling napi_complete, en IRQs
[   48.080000] [bgmac_poll] END returning handled:0

[   48.100000] [bgmac_poll] START weight:64
[   48.110000] [bgmac_poll] Read handled:1 skb-s
[   48.110000] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:1 BGMAC_IS_RX:1
[   48.120000] [bgmac_poll] END more skb-s to handle, returning handled:1

[   48.130000] [bgmac_poll] START weight:64
[   48.130000] [bgmac_poll] Read handled:1 skb-s
[   48.140000] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:0 BGMAC_IS_RX:0
[   48.150000] [bgmac_poll] Read skb-s < weight, calling napi_complete, en IRQs
[   48.160000] [bgmac_poll] END returning handled:1

Please note that at 48.120000 bgmac decided bgmac_poll should be
called again and it didn't call napi_complete & didn't enable IRQs. It
simply returned 1.
NAPI behaved just like we expected, bgmac_poll was called again a
moment later. It handled skb-s that appeared meanwhile and ended
calling napi_complete and returning 1.


Now, it fails badly for me at all when using kernel 4.0.0:

[   52.800000] [bgmac_interrupt] int_status & int_mask = 0x01000000

[   52.800000] [bgmac_poll] START weight:64
[   52.810000] [bgmac_poll] Read handled:0 skb-s
[   52.810000] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:0 BGMAC_IS_RX:0
[   52.820000] [bgmac_poll] Read skb-s < weight, calling napi_complete, en IRQs
[   52.830000] [bgmac_poll] END returning handled:0

[   54.610000] [bgmac_interrupt] int_status & int_mask = 0x01000000

[   54.620000] [bgmac_poll] START weight:64
[   54.630000] [bgmac_poll] Read handled:0 skb-s
[   54.630000] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:0 BGMAC_IS_RX:0
[   54.640000] [bgmac_poll] Read skb-s < weight, calling napi_complete, en IRQs
[   54.650000] [bgmac_poll] END returning handled:0

[   55.900000] [bgmac_interrupt] int_status & int_mask = 0x01000000

[   55.910000] [bgmac_poll] START weight:64
[   55.920000] [bgmac_poll] Read handled:1 skb-s
[   55.920000] [bgmac_poll] Read IRQ status again BGMAC_IS_TX0:1 BGMAC_IS_RX:1
[   55.930000] [bgmac_poll] END more skb-s to handle, returning handled:1

As you can see, at 55.930000 bgmac also decided bgmac_poll should be
called again. It didn't call napi_complete, didn't enable IRQs. It
simply returned 1. Just like with kernel 3.8.11.
It this case however, NAPI never called bgmac_poll again. It resulted
in bgmac hanging (IRQs are off, bgmac expects bgmac_poll to be
called).


Can you help us with this, please? Does bgmac use NAPI incorrectly?
Were there some important changes in 3.19 or 4.0?

Comments

Eric Dumazet April 23, 2015, 6:28 p.m. UTC | #1
On Thu, 2015-04-23 at 20:18 +0200, Rafał Miłecki wrote:

> 
> Can you help us with this, please? Does bgmac use NAPI incorrectly?
> Were there some important changes in 3.19 or 4.0?
> 

Right they were important changes in NAPI indeed :

3b47d30396ba net: gro: add a per device gro flush timer
d75b1ade567f net: less interrupt masking in NAPI
bc9ad166e38a net: introduce napi_schedule_irqoff()

This requested some fixes in various drivers :

commit f31ec95fa19e07a8beebcc0297284f23aa57967e
commit 24e579c8898aa641ede3149234906982290934e5
commit 6088beef3f7517717bd21d90b379714dd0837079
commit f104fedc0da126abe93dd0f4a9fa13e5133bf9df
commit 7a05dc64e2e4c611d89007b125b20c0d2a4d31a5
commit 001ce546bb537bb5b7821f05633556a0c9787e32
commit 3079c652141f9d6377417a7e8fd650c9948df65e
commit 8acdf999accfd95093db17f33a58429a38782060
commit 6a6dc08ff6395f58be3ee568cb970ea956f16819


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafał Miłecki April 23, 2015, 7:08 p.m. UTC | #2
On 23 April 2015 at 20:28, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2015-04-23 at 20:18 +0200, Rafał Miłecki wrote:
>
>>
>> Can you help us with this, please? Does bgmac use NAPI incorrectly?
>> Were there some important changes in 3.19 or 4.0?
>>
>
> Right they were important changes in NAPI indeed :
>
> 3b47d30396ba net: gro: add a per device gro flush timer
> d75b1ade567f net: less interrupt masking in NAPI
> bc9ad166e38a net: introduce napi_schedule_irqoff()

As you obviously noticed, I fixed bgmac now and submitted the fix.

Just wanted to say a thanks for helping with that!
Eric Dumazet April 23, 2015, 7:27 p.m. UTC | #3
On Thu, 2015-04-23 at 21:08 +0200, Rafał Miłecki wrote:

> As you obviously noticed, I fixed bgmac now and submitted the fix.
> 
> Just wanted to say a thanks for helping with that!
> 

Sure, it was a pleasure ;)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
index 4929932..40ce34d 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -1219,6 +1219,7 @@  static irqreturn_t bgmac_interrupt(int irq, void *dev_id)
 	struct bgmac *bgmac = netdev_priv(dev_id);
 
 	u32 int_status = bgmac_read(bgmac, BGMAC_INT_STATUS);
+	pr_info("[%s] int_status:0x%08x & int_mask:0x%08x = 0x%08x\n", __FUNCTION__, int_status, bgmac->int_mask, int_status & bgmac->int_mask);
 	int_status &= bgmac->int_mask;
 
 	if (!int_status)
@@ -1240,22 +1241,31 @@  static int bgmac_poll(struct napi_struct *napi, int weight)
 {
 	struct bgmac *bgmac = container_of(napi, struct bgmac, napi);
 	int handled = 0;
+	u32 int_status;
 
+	pr_info("[%s] START weight:%d\n", __FUNCTION__, weight);
 	/* Ack */
 	bgmac_write(bgmac, BGMAC_INT_STATUS, ~0);
 
 	bgmac_dma_tx_free(bgmac, &bgmac->tx_ring[0]);
 	handled += bgmac_dma_rx_read(bgmac, &bgmac->rx_ring[0], weight);
+	pr_info("[%s] Read handled:%d skb-s\n", __FUNCTION__, handled);
 
 	/* Poll again if more events arrived in the meantime */
-	if (bgmac_read(bgmac, BGMAC_INT_STATUS) & (BGMAC_IS_TX0 | BGMAC_IS_RX))
+	int_status = bgmac_read(bgmac, BGMAC_INT_STATUS);
+	pr_info("[%s] Read IRQ status again BGMAC_IS_TX0:%d BGMAC_IS_RX:%d\n", __FUNCTION__, !!(int_status & BGMAC_IS_TX0), !!(int_status & BGMAC_IS_RX));
+	if (int_status & (BGMAC_IS_TX0 | BGMAC_IS_RX)) {
+		pr_info("[%s] END more skb-s to handle, returning handled:%d\n", __FUNCTION__, handled);
 		return handled;
+	}
 
 	if (handled < weight) {
+		pr_info("[%s] Read skb-s < weight, calling napi_complete, en IRQs\n", __FUNCTION__);
 		napi_complete(napi);
 		bgmac_chip_intrs_on(bgmac);
 	}
 
+	pr_info("[%s] END returning handled:%d\n", __FUNCTION__, handled);
 	return handled;
 }