Message ID | 1366136460-30732-1-git-send-email-bigeasy@linutronix.de |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
On 4/16/2013 11:51 PM, Sebastian Andrzej Siewior wrote: > This driver has usually four interrupts registered with the same ISR. > The ISR masks the interrupt source in DMA engine and CPSW, disables the > interrupt and schedules NAPI. After the NAPI routine completes, the > source is activated and interrupts are re-enabled. > With threaded-interrupts enabled, the enable/disable might go wrong. > After the RX interrupt arrived, the core marks the interrupt pending. If > the TX interrupt arrives before the RX started, then both lines are > marked pending and both disable the interrupt which then is disabled > twice. In NAPI complete the interrupt is enabled only once which means > the four interrupts are still masked and the device plays dead. > > While playing with this on my beagle bone I didn't understand why > the interrupts are deactivated in the first place. According to my > testing, after calling cpsw_intr_disable() the interrupt is not active > anymore. I haven't seen the ISR being invoked again until after > cpsw_poll() enabled them (except for a different interrupt number). > Therefore I remove this. > > In my testing I didn't notice anything unusual except one thing: I start > a wget of a 126MiB file (which is sttored on MMC) from the beagle bone. > On the first invocation I receive almost steady 5MiB/sec. In the > following invocations of wget I see the performance floating between > 1MiB/sec and 4MiB/sec. It usually ends with overall around 2MiB/sec. > I couldn't notice nothing wrong. The only difference compared to the > first invocation is (most likely) that the file is now served from > memory and not read from MMC which should perform better but not worse. > After executing "while ((1)); do /bin/true; done" on the beagle bone > while the file was downloaded, I noticed that the speed rose to 9MiB/sec > - 10/MiB/sec. Now this looks like power/idle optimization on the beagle > bone side. > > Cc: Thomas Gleixner <tglx@linutronix.de> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > --- > drivers/net/ethernet/ti/cpsw.c | 16 ---------------- > 1 file changed, 16 deletions(-) > > diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c > index e2ba702..acb229c 100644 > --- a/drivers/net/ethernet/ti/cpsw.c > +++ b/drivers/net/ethernet/ti/cpsw.c > @@ -133,19 +133,6 @@ do { \ > #define CPSW_CMINTMAX_INTVL (1000 / CPSW_CMINTMIN_CNT) > #define CPSW_CMINTMIN_INTVL ((1000 / CPSW_CMINTMAX_CNT) + 1) > > -#define cpsw_enable_irq(priv) \ > - do { \ > - u32 i; \ > - for (i = 0; i < priv->num_irqs; i++) \ > - enable_irq(priv->irqs_table[i]); \ > - } while (0); > -#define cpsw_disable_irq(priv) \ > - do { \ > - u32 i; \ > - for (i = 0; i < priv->num_irqs; i++) \ > - disable_irq_nosync(priv->irqs_table[i]); \ > - } while (0); > - > #define cpsw_slave_index(priv) \ > ((priv->data.dual_emac) ? priv->emac_port : \ > priv->data.active_slave) > @@ -513,13 +500,11 @@ static irqreturn_t cpsw_interrupt(int irq, void *dev_id) > > if (likely(netif_running(priv->ndev))) { > cpsw_intr_disable(priv); > - cpsw_disable_irq(priv); > napi_schedule(&priv->napi); > } else { > priv = cpsw_get_slave_priv(priv, 1); > if (likely(priv) && likely(netif_running(priv->ndev))) { > cpsw_intr_disable(priv); > - cpsw_disable_irq(priv); > napi_schedule(&priv->napi); > } > } > @@ -540,7 +525,6 @@ static int cpsw_poll(struct napi_struct *napi, int budget) > napi_complete(napi); > cpsw_intr_enable(priv); > cpdma_ctlr_eoi(priv->dma, CPDMA_EOI_RX); > - cpsw_enable_irq(priv); > } > > if (num_rx || num_tx) When using cpsw_intr_disable, it actually only disables future interrupts from CPSW ip. But the current interrupt generated to interrupt controller will be still pending and will not allow ARM to do any thing till either interrups is disabled or acked, thus CPU will be in CPSW ISR continuously and system will hang. This patch is applicable only when kernel is always built with RT enabled which is not the current case in Vanilla kernel I have tested this patch and it hangs the CPU after net open (CPSW init). Regards Mugunthan V N -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/17/2013 08:19 AM, Mugunthan V N wrote: > When using cpsw_intr_disable, it actually only disables future interrupts > from CPSW ip. But the current interrupt generated to interrupt controller > will be still pending and will not allow ARM to do any thing till either > interrups is disabled or acked, thus CPU will be in CPSW ISR continuously > and system will hang. This patch is applicable only when kernel is always > built with RT enabled which is not the current case in Vanilla kernel I am not talking about RT, just threaded interrupts but yes I saw the problem first with RT. > I have tested this patch and it hangs the CPU after net open (CPSW init). If I remove additionally the napi_schedule() piece then the network is dead (as expected) and the system continues to work (and I receive interrupts) and the ISR for cpsw is not executed again. What hardware / CPU do you have? > > Regards > Mugunthan V N Sebastian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 4/17/2013 1:14 PM, Sebastian Andrzej Siewior wrote: >> I have tested this patch and it hangs the CPU after net open (CPSW init). > If I remove additionally the napi_schedule() piece then the network is > dead (as expected) and the system continues to work (and I receive > interrupts) and the ISR for cpsw is not executed again. What hardware / > CPU do you have? > I am testing on Beagle Bone black. Regards Mugunthan V N -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/17/2013 10:46 AM, Mugunthan V N wrote: > On 4/17/2013 1:14 PM, Sebastian Andrzej Siewior wrote: >>> I have tested this patch and it hangs the CPU after net open (CPSW >>> init). >> If I remove additionally the napi_schedule() piece then the network is >> dead (as expected) and the system continues to work (and I receive >> interrupts) and the ISR for cpsw is not executed again. What hardware / >> CPU do you have? >> > I am testing on Beagle Bone black. I have here vanila v3.8.4 + cpsw patches which were in net-next. As for the hardware: [ 0.000000] Machine: Generic AM33XX (Flattened Device Tree), model: TI AM335x BeagleBone [ 0.000000] AM335X ES1.0 (neon ) model name : ARMv7 Processor rev 2 (v7l) CPU implementer : 0x41 CPU architecture: 7 CPU variant : 0x3 CPU part : 0xc08 CPU revision : 2 If we do have the same hardware and source why do I have different behavior? Can you send me your .config off-list? > Regards > Mugunthan V N Sebastian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/16/2013 08:21 PM, Sebastian Andrzej Siewior wrote: > In my testing I didn't notice anything unusual except one thing: I start > a wget of a 126MiB file (which is sttored on MMC) from the beagle bone. > On the first invocation I receive almost steady 5MiB/sec. In the > following invocations of wget I see the performance floating between > 1MiB/sec and 4MiB/sec. It usually ends with overall around 2MiB/sec. > I couldn't notice nothing wrong. The only difference compared to the > first invocation is (most likely) that the file is now served from > memory and not read from MMC which should perform better but not worse. > After executing "while ((1)); do /bin/true; done" on the beagle bone > while the file was downloaded, I noticed that the speed rose to 9MiB/sec > - 10/MiB/sec. Now this looks like power/idle optimization on the beagle > bone side. Yeah, disabling NO_HZ gives 11.2 MiB/s. Sebastian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 4/17/2013 2:32 PM, Sebastian Andrzej Siewior wrote: > On 04/17/2013 10:46 AM, Mugunthan V N wrote: >> On 4/17/2013 1:14 PM, Sebastian Andrzej Siewior wrote: >>>> I have tested this patch and it hangs the CPU after net open (CPSW >>>> init). >>> If I remove additionally the napi_schedule() piece then the network is >>> dead (as expected) and the system continues to work (and I receive >>> interrupts) and the ISR for cpsw is not executed again. What hardware / >>> CPU do you have? >>> >> I am testing on Beagle Bone black. > I have here vanila v3.8.4 + cpsw patches which were in net-next. As for > the hardware: > > [ 0.000000] Machine: Generic AM33XX (Flattened Device Tree), model: > TI AM335x BeagleBone > [ 0.000000] AM335X ES1.0 (neon ) > > model name : ARMv7 Processor rev 2 (v7l) > CPU implementer : 0x41 > CPU architecture: 7 > CPU variant : 0x3 > CPU part : 0xc08 > CPU revision : 2 > > If we do have the same hardware and source why do I have different > behavior? Can you send me your .config off-list? > Mine shows [ 0.000000] AM335X ES2.0 (neon ) In Beagle bone (silicon revision 1.0) there is a bug in CPSW irq in Silicon, please refer http://www.ti.com/lit/er/sprz360e/sprz360e.pdf Advisory 1.0.9 Beagle bone black has Silicon revision 2.0 where the bug is fixed and you are able to test it properly and it hangs in my bone black as the IRQ is properly connected to A8 Regards Mugunthan V N -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/17/2013 12:08 PM, Mugunthan V N wrote: > Mine shows [ 0.000000] AM335X ES2.0 (neon ) > > In Beagle bone (silicon revision 1.0) there is a bug in CPSW irq in > Silicon, please refer > http://www.ti.com/lit/er/sprz360e/sprz360e.pdf Advisory 1.0.9 > > Beagle bone black has Silicon revision 2.0 where the bug is fixed and > you are able > to test it properly and it hangs in my bone black as the IRQ is properly > connected > to A8 Okay. This would explain things. So let me try it again without breaking the new one. If you disable_irq() there is no need to mask the source in chip, right? > > Regards > Mugunthan V N Sebastian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 4/17/2013 4:19 PM, Sebastian Andrzej Siewior wrote: > On 04/17/2013 12:08 PM, Mugunthan V N wrote: >> Mine shows [ 0.000000] AM335X ES2.0 (neon ) >> >> In Beagle bone (silicon revision 1.0) there is a bug in CPSW irq in >> Silicon, please refer >> http://www.ti.com/lit/er/sprz360e/sprz360e.pdf Advisory 1.0.9 >> >> Beagle bone black has Silicon revision 2.0 where the bug is fixed and >> you are able >> to test it properly and it hangs in my bone black as the IRQ is properly >> connected >> to A8 > Okay. This would explain things. So let me try it again without > breaking the new one. If you disable_irq() there is no need to mask the > source in chip, right? > Yes, May be we can try. If the performance is more then we can think of removing disabling interrupt in CPSW and use only disable ARM irq. Regards Mugunthan V N -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, I think the drivers/net/usb/asix.c drivers is broken for the C1 hardware of the D-Link DUB E100 USB Ethernet Adapter. I setup my ARM target board with D-Link DUB E100 C1 USB Ethernet Adapter by using ifconfig eth0 172.17.0.1 On a Linux PC directly connected via an Ethernet cable to the target I do ping 172.17.0.1 -c 1 -s 1965 but no response is received by the PC. The ARM target generates errors such as: [ 202.519377] asix 1-2:1.0: eth0: asix_rx_fixup() Bad Header Length [ 202.525593] asix 1-2:1.0: eth0: asix_rx_fixup() Bad RX Length 1925 Note that this following ping works OK by shortening the length by 1: ping 172.17.0.1 -c 1 -s 1964 Note that this ping test causes IP fragmentation to be used. I am using a modified 2.6.34.13 kernel with back-ported changes for the ASIX driver. I note that Linux 3.9-rc7 has an additional change to asix_rx_fixup() that looks helpful. I backported that commit 8b5b6f5413e97c3e8bafcdd67553d508f4f698cd to the ARM now ping 172.17.0.1 -c 1 -s 1965 gives the following run-time kernel errors on the ARM target and the ping does not work [ 1193.872116] asix 1-2:1.0: eth0: asix_rx_fixup() Bad Header Length 0xffff2f13, offset 6 It is possible that my back-porting is wrong or I have commits missing. As the ping test is simple, please can anyone with a D-Link DUB E100 C1 USB Ethernet Adapter run my ping test on any platform to see whether any Linux version has a working ASIX driver for this C1 hardware. I note that the ASIX driver split into multiple files in Linux 3.6 which complicates back-porting. Thanks for any observations of a working ASIX driver for this C1 hardware variant using my ping test. Regards, Dean Jenkins Mentor Graphics -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/17/2013 12:08 PM, Mugunthan V N wrote: > Mine shows [ 0.000000] AM335X ES2.0 (neon ) > > In Beagle bone (silicon revision 1.0) there is a bug in CPSW irq in > Silicon, please refer > http://www.ti.com/lit/er/sprz360e/sprz360e.pdf Advisory 1.0.9 > > Beagle bone black has Silicon revision 2.0 where the bug is fixed and > you are able > to test it properly and it hangs in my bone black as the IRQ is properly > connected > to A8 Are you there is not something else going on? According to the TRM 14.3.1.3.1 + .14.3.1.3.2 the interrupts are properly disabled. So according to the TRM, the interrupts should not be disabled and not appear anymore. Advisory 1.0.9 says that only TXPEND[0] and RXPEND[0] are connected to the INTC rather than TXPEND[0-7] (+RXPEND) and therefore only channel 0 can be used. That means I have still no idea why the interrupt is not cleared on your ES2.0. > Regards > Mugunthan V N Sebastian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c index e2ba702..acb229c 100644 --- a/drivers/net/ethernet/ti/cpsw.c +++ b/drivers/net/ethernet/ti/cpsw.c @@ -133,19 +133,6 @@ do { \ #define CPSW_CMINTMAX_INTVL (1000 / CPSW_CMINTMIN_CNT) #define CPSW_CMINTMIN_INTVL ((1000 / CPSW_CMINTMAX_CNT) + 1) -#define cpsw_enable_irq(priv) \ - do { \ - u32 i; \ - for (i = 0; i < priv->num_irqs; i++) \ - enable_irq(priv->irqs_table[i]); \ - } while (0); -#define cpsw_disable_irq(priv) \ - do { \ - u32 i; \ - for (i = 0; i < priv->num_irqs; i++) \ - disable_irq_nosync(priv->irqs_table[i]); \ - } while (0); - #define cpsw_slave_index(priv) \ ((priv->data.dual_emac) ? priv->emac_port : \ priv->data.active_slave) @@ -513,13 +500,11 @@ static irqreturn_t cpsw_interrupt(int irq, void *dev_id) if (likely(netif_running(priv->ndev))) { cpsw_intr_disable(priv); - cpsw_disable_irq(priv); napi_schedule(&priv->napi); } else { priv = cpsw_get_slave_priv(priv, 1); if (likely(priv) && likely(netif_running(priv->ndev))) { cpsw_intr_disable(priv); - cpsw_disable_irq(priv); napi_schedule(&priv->napi); } } @@ -540,7 +525,6 @@ static int cpsw_poll(struct napi_struct *napi, int budget) napi_complete(napi); cpsw_intr_enable(priv); cpdma_ctlr_eoi(priv->dma, CPDMA_EOI_RX); - cpsw_enable_irq(priv); } if (num_rx || num_tx)
This driver has usually four interrupts registered with the same ISR. The ISR masks the interrupt source in DMA engine and CPSW, disables the interrupt and schedules NAPI. After the NAPI routine completes, the source is activated and interrupts are re-enabled. With threaded-interrupts enabled, the enable/disable might go wrong. After the RX interrupt arrived, the core marks the interrupt pending. If the TX interrupt arrives before the RX started, then both lines are marked pending and both disable the interrupt which then is disabled twice. In NAPI complete the interrupt is enabled only once which means the four interrupts are still masked and the device plays dead. While playing with this on my beagle bone I didn't understand why the interrupts are deactivated in the first place. According to my testing, after calling cpsw_intr_disable() the interrupt is not active anymore. I haven't seen the ISR being invoked again until after cpsw_poll() enabled them (except for a different interrupt number). Therefore I remove this. In my testing I didn't notice anything unusual except one thing: I start a wget of a 126MiB file (which is sttored on MMC) from the beagle bone. On the first invocation I receive almost steady 5MiB/sec. In the following invocations of wget I see the performance floating between 1MiB/sec and 4MiB/sec. It usually ends with overall around 2MiB/sec. I couldn't notice nothing wrong. The only difference compared to the first invocation is (most likely) that the file is now served from memory and not read from MMC which should perform better but not worse. After executing "while ((1)); do /bin/true; done" on the beagle bone while the file was downloaded, I noticed that the speed rose to 9MiB/sec - 10/MiB/sec. Now this looks like power/idle optimization on the beagle bone side. Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- drivers/net/ethernet/ti/cpsw.c | 16 ---------------- 1 file changed, 16 deletions(-)