Patchwork can: flexcan: fix irq flooding by clearing all interrupt sources

Submitter Wolfgang Grandegger
Date Dec. 12, 2011, 3:09 p.m.
Message ID <4EE61928.10608@grandegger.com>
Permalink /patch/130761/
State Awaiting Upstream
Delegated to: David Miller

Comments

Wolfgang Grandegger - Dec. 12, 2011, 3:09 p.m.
As pointed out by Reuben Dowle and Lothar Waßmann, the TWRN_INT,
RWRN_INT and BOFF_INT interrupt sources need to be cleared as well
to avoid interrupt flooding, at least for the Flexcan on i.MX28
SoCs. Furthermore, the interrupts are now cleared only if at least
one of those interrupt sources is actually pending (which is not
the case for rx and tx done).

CC: Reuben Dowle <Reuben.Dowle@navico.com>
CC: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
---
 drivers/net/can/flexcan.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)
Marc Kleine-Budde - Dec. 12, 2011, 3:31 p.m.
On 12/12/2011 04:09 PM, Wolfgang Grandegger wrote:
> As pointed out by Reuben Dowle and Lothar Waßmann, the TWRN_INT,
> RWRN_INT, BOFF_INT interrupt sources need to be cleared as well
> to avoid interrupt flooding, at least for the Flexcan on i.MX28
> SOCs. Furthermore, the interrupts are only cleared, if really one
> of those interrupt sources are pending (which is not the case for
> rx and tx done).
> 
> CC: Reuben Dowle <Reuben.Dowle@navico.com>
> CC: Lothar Waßmann <LW@KARO-electronics.de>
> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>

Have you tested on mx25/mx35, does it have any negative side effects?
My schedule is full until Friday, sorry cannot test here.

Marc

> ---
>  drivers/net/can/flexcan.c |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
> index 165a4c7..111f154 100644
> --- a/drivers/net/can/flexcan.c
> +++ b/drivers/net/can/flexcan.c
> @@ -118,6 +118,9 @@
>  	(FLEXCAN_ESR_TWRN_INT | FLEXCAN_ESR_RWRN_INT | FLEXCAN_ESR_BOFF_INT)
>  #define FLEXCAN_ESR_ERR_ALL \
>  	(FLEXCAN_ESR_ERR_BUS | FLEXCAN_ESR_ERR_STATE)
> +#define FLEXCAN_ESR_ALL_INT \
> +	(FLEXCAN_ESR_TWRN_INT | FLEXCAN_ESR_RWRN_INT | \
> +	 FLEXCAN_ESR_BOFF_INT | FLEXCAN_ESR_ERR_INT)
>  
>  /* FLEXCAN interrupt flag register (IFLAG) bits */
>  #define FLEXCAN_TX_BUF_ID		8
> @@ -577,7 +580,9 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id)
>  
>  	reg_iflag1 = flexcan_read(&regs->iflag1);
>  	reg_esr = flexcan_read(&regs->esr);
> -	flexcan_write(FLEXCAN_ESR_ERR_INT, &regs->esr);	/* ACK err IRQ */
> +	/* ACK all bus error and state change IRQ sources */
> +	if (reg_esr & FLEXCAN_ESR_ALL_INT)
> +		flexcan_write(reg_esr & FLEXCAN_ESR_ALL_INT, &regs->esr);
>  
>  	/*
>  	 * schedule NAPI in case of:
Wolfgang Grandegger - Dec. 12, 2011, 3:44 p.m.
On 12/12/2011 04:31 PM, Marc Kleine-Budde wrote:
> On 12/12/2011 04:09 PM, Wolfgang Grandegger wrote:
>> As pointed out by Reuben Dowle and Lothar Waßmann, the TWRN_INT,
>> RWRN_INT, BOFF_INT interrupt sources need to be cleared as well
>> to avoid interrupt flooding, at least for the Flexcan on i.MX28
>> SOCs. Furthermore, the interrupts are only cleared, if really one
>> of those interrupt sources are pending (which is not the case for
>> rx and tx done).
>>
>> CC: Reuben Dowle <Reuben.Dowle@navico.com>
>> CC: Lothar Waßmann <LW@KARO-electronics.de>
>> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
> 
> Have you tested on mx25/mx35, does it have any negative side effects?
> My schedule is full until Friday, sorry cannot test here.

Not yet. But it's not critical. Only the pending interrupt flags are
cleared. Maybe somebody else out there could do some testing... before I
get hold of a MX35PDK board.

Wolfgang.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfgang Grandegger - Dec. 13, 2011, 12:13 p.m.
On 12/12/2011 04:44 PM, Wolfgang Grandegger wrote:
> On 12/12/2011 04:31 PM, Marc Kleine-Budde wrote:
>> On 12/12/2011 04:09 PM, Wolfgang Grandegger wrote:
>>> As pointed out by Reuben Dowle and Lothar Waßmann, the TWRN_INT,
>>> RWRN_INT, BOFF_INT interrupt sources need to be cleared as well
>>> to avoid interrupt flooding, at least for the Flexcan on i.MX28
>>> SOCs. Furthermore, the interrupts are only cleared, if really one
>>> of those interrupt sources are pending (which is not the case for
>>> rx and tx done).
>>>
>>> CC: Reuben Dowle <Reuben.Dowle@navico.com>
>>> CC: Lothar Waßmann <LW@KARO-electronics.de>
>>> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
>>
>> Have you tested on mx25/mx35, does it have any negative side effects?
>> My schedule is full until Friday, sorry cannot test here.
> 
> Not yet. But it's not critical. Only the pending interrupt flags are
> cleared. Maybe somebody else out there could do some testing... before I
> get hold of a MX35PDK board.

I got my MX35PDK board working and can confirm that the patch works on
an i.MX35 as well. My testing also confirms that the ESR flags TWRN_INT,
RWRN_INT and BOFF_INT do not function as documented. These
flags show up once, together with ERR_INT, and then, after clearing,
never again. Obviously a bug in the Flexcan logic. From the feedback we
can say that only the i.MX28 behaves differently (== correctly). All
others seem to work with the current code:

 Flexcan on
 - i.mx25
 - i.mx35
 - i.mx53
 - P1010/P1020

Wolfgang.

Marc Kleine-Budde - Dec. 13, 2011, 12:53 p.m.
On 12/13/2011 01:13 PM, Wolfgang Grandegger wrote:
> On 12/12/2011 04:44 PM, Wolfgang Grandegger wrote:
>> On 12/12/2011 04:31 PM, Marc Kleine-Budde wrote:
>>> On 12/12/2011 04:09 PM, Wolfgang Grandegger wrote:
>>>> As pointed out by Reuben Dowle and Lothar Waßmann, the TWRN_INT,
>>>> RWRN_INT, BOFF_INT interrupt sources need to be cleared as well
>>>> to avoid interrupt flooding, at least for the Flexcan on i.MX28
>>>> SOCs. Furthermore, the interrupts are only cleared, if really one
>>>> of those interrupt sources are pending (which is not the case for
>>>> rx and tx done).
>>>>
>>>> CC: Reuben Dowle <Reuben.Dowle@navico.com>
>>>> CC: Lothar Waßmann <LW@KARO-electronics.de>
>>>> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
>>>
>>> Have you tested on mx25/mx35, does it have any negative side effects?
>>> My schedule is full until Friday, sorry cannot test here.
>>
>> Not yet. But it's not critical. Only the pending interrupt flags are
>> cleared. Maybe somebody else out there could do some testing... before I
>> get hold of a MX35PDK board.
> 
> I got my MX35PDK board working and can confirm, that the patch works on
> a i.mx35 as well. My testing also confirms, that the ESR TWRN_INT,
> RWRN_INT, FLEXCAN_ESR_BOFF_INT do not function as documented. These
> flags do show up once, together with ERR_INT, and then, after clearing,
> never again. Obviously a bug in the Flexcan logic. From the feedback we
> can say, that only the i.MX28 does behave differently (==correctly). All
> other seem to work with the current code:
> 
>  Flexcan on
>  - i.mx25
>  - i.mx35
>  - i.mx53
>  - P1010/P1020

I'm adding the patch to linux-can. I think this is a stable candidate
for v2.6.39 and newer (mx28 support was added in 2.6.39).

Marc
Wolfgang Grandegger - Dec. 13, 2011, 12:53 p.m.
On 12/13/2011 01:13 PM, Wolfgang Grandegger wrote:
> On 12/12/2011 04:44 PM, Wolfgang Grandegger wrote:
>> On 12/12/2011 04:31 PM, Marc Kleine-Budde wrote:
>>> On 12/12/2011 04:09 PM, Wolfgang Grandegger wrote:
>>>> As pointed out by Reuben Dowle and Lothar Waßmann, the TWRN_INT,
>>>> RWRN_INT, BOFF_INT interrupt sources need to be cleared as well
>>>> to avoid interrupt flooding, at least for the Flexcan on i.MX28
>>>> SOCs. Furthermore, the interrupts are only cleared, if really one
>>>> of those interrupt sources are pending (which is not the case for
>>>> rx and tx done).
>>>>
>>>> CC: Reuben Dowle <Reuben.Dowle@navico.com>
>>>> CC: Lothar Waßmann <LW@KARO-electronics.de>
>>>> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
>>>
>>> Have you tested on mx25/mx35, does it have any negative side effects?
>>> My schedule is full until Friday, sorry cannot test here.
>>
>> Not yet. But it's not critical. Only the pending interrupt flags are
>> cleared. Maybe somebody else out there could do some testing... before I
>> get hold of a MX35PDK board.
> 
> I got my MX35PDK board working and can confirm, that the patch works on
> a i.mx35 as well. My testing also confirms, that the ESR TWRN_INT,
> RWRN_INT, FLEXCAN_ESR_BOFF_INT do not function as documented. These
> flags do show up once, together with ERR_INT, and then, after clearing,
> never again. Obviously a bug in the Flexcan logic. From the feedback we
> can say, that only the i.MX28 does behave differently (==correctly). All
> other seem to work with the current code:
> 
>  Flexcan on
>  - i.mx25
>  - i.mx35
>  - i.mx53
>  - P1010/P1020

But unfortunately, state change reporting looks different with this patch :)

Wolfgang.
Marc Kleine-Budde - Dec. 13, 2011, 12:59 p.m.
On 12/13/2011 01:53 PM, Wolfgang Grandegger wrote:
>> I got my MX35PDK board working and can confirm, that the patch works on
>> a i.mx35 as well. My testing also confirms, that the ESR TWRN_INT,
>> RWRN_INT, FLEXCAN_ESR_BOFF_INT do not function as documented. These
>> flags do show up once, together with ERR_INT, and then, after clearing,
>> never again. Obviously a bug in the Flexcan logic. From the feedback we
>> can say, that only the i.MX28 does behave differently (==correctly). All
>> other seem to work with the current code:
>>
>>  Flexcan on
>>  - i.mx25
>>  - i.mx35
>>  - i.mx53
>>  - P1010/P1020
> 
> But unfortunately, state change reporting looks different with this patch :)

Hmm - so not schedule for stable. What about your bus-off handling, will
this change the reporting again?

Marc
Wolfgang Grandegger - Dec. 13, 2011, 4:22 p.m.
On 12/13/2011 01:59 PM, Marc Kleine-Budde wrote:
> On 12/13/2011 01:53 PM, Wolfgang Grandegger wrote:
>>> I got my MX35PDK board working and can confirm, that the patch works on
>>> a i.mx35 as well. My testing also confirms, that the ESR TWRN_INT,
>>> RWRN_INT, FLEXCAN_ESR_BOFF_INT do not function as documented. These
>>> flags do show up once, together with ERR_INT, and then, after clearing,
>>> never again. Obviously a bug in the Flexcan logic. From the feedback we
>>> can say, that only the i.MX28 does behave differently (==correctly). All
>>> other seem to work with the current code:
>>>
>>>  Flexcan on
>>>  - i.mx25
>>>  - i.mx35
>>>  - i.mx53
>>>  - P1010/P1020
>>
>> But unfortunately, state change reporting looks different with this patch :)
> 
> Hmm - so not schedule for stable. What about your buf-off-handling, will
> this change the reporting again?

Well, as it is a serious problem on i.MX28, I would schedule this patch
for stable as well. The error and state change reporting is bogus on the
Flexcan anyhow. Without this patch, I get "active->warning->passive" if
I send a message with the cable disconnected (no ack). With the patch,
I get just "active->warning". That's the same behaviour as on the
i.MX28, also with my new state and bus-off handling. See:

https://gitorious.org/~wgrandegger/linux-can/wg-linux-can-next/commit/bd3acb12dbb9551541d28ae8766c154d3cf6ed57

Wolfgang.

Wolfgang Grandegger - Dec. 14, 2011, 1:21 p.m.
Hi Marc,

On 12/13/2011 05:22 PM, Wolfgang Grandegger wrote:
> On 12/13/2011 01:59 PM, Marc Kleine-Budde wrote:
>> On 12/13/2011 01:53 PM, Wolfgang Grandegger wrote:
>>>> I got my MX35PDK board working and can confirm, that the patch works on
>>>> a i.mx35 as well. My testing also confirms, that the ESR TWRN_INT,
>>>> RWRN_INT, FLEXCAN_ESR_BOFF_INT do not function as documented. These
>>>> flags do show up once, together with ERR_INT, and then, after clearing,
>>>> never again. Obviously a bug in the Flexcan logic. From the feedback we
>>>> can say, that only the i.MX28 does behave differently (==correctly). All
>>>> other seem to work with the current code:
>>>>
>>>>  Flexcan on
>>>>  - i.mx25
>>>>  - i.mx35
>>>>  - i.mx53
>>>>  - P1010/P1020
>>>
>>> But unfortunately, state change reporting looks different with this patch :)
>>
>> Hmm - so not schedule for stable. What about your buf-off-handling, will
>> this change the reporting again?
> 
> Well, as it is a serious problem on i.MX28, I would schedule this patch
> for stable as well. The error and state change reporting is bogus on the
> Flexcan anyhow. Without this patch, I get "active->warning->passive" if
> I send a message with cable disconnect (no ack). With patch just
> "active->warning". That's the same behaviour
> as on the i.MX28, also with my new state and bus-off handling. See:

I now understand the difference. In flexcan_irq() we have:

	/*
	 * schedule NAPI in case of:
	 * - rx IRQ
	 * - state change IRQ
	 * - bus error IRQ and bus error reporting is activated
	 */
	if ((reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_AVAILABLE) ||
	    (reg_esr & FLEXCAN_ESR_ERR_STATE) ||
	    flexcan_has_and_handle_berr(priv, reg_esr)) {

Without this patch, "reg_esr & FLEXCAN_ESR_ERR_STATE" is *always* true
because RWRN_INT is never cleared. This means NAPI is scheduled on
*any* interrupt. As a nice side effect, this patch fixes this bug as well.

Only because NAPI was scheduled on every interrupt was the state change
to error passive recognized in flexcan_poll(). The state change to error
passive is not signaled by an extra interrupt and is therefore only
visible together with RX, TX done or bus-error events.

Wolfgang.
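
The scheduling effect described in this message can be sketched as a small user-space model in plain C. It is not driver code: struct sim_esr, sim_esr_write(), irq_old(), irq_new() and count_state_schedules() are hypothetical names, the bit positions are assumptions for illustration, and the register is modelled as simple write-1-to-clear storage that never sets a flag again on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only; check the SoC manual. */
#define ESR_TWRN_INT	(1u << 17)
#define ESR_RWRN_INT	(1u << 16)
#define ESR_BOFF_INT	(1u << 2)
#define ESR_ERR_INT	(1u << 1)
#define ESR_ERR_STATE	(ESR_TWRN_INT | ESR_RWRN_INT | ESR_BOFF_INT)
#define ESR_ALL_INT	(ESR_ERR_STATE | ESR_ERR_INT)

/* Hypothetical model of a write-1-to-clear status register. */
struct sim_esr {
	uint32_t flags;
};

static void sim_esr_write(struct sim_esr *esr, uint32_t val)
{
	esr->flags &= ~val;	/* writing 1 clears the flag */
}

/* Old handler behaviour: read ESR, ACK only ERR_INT. */
static uint32_t irq_old(struct sim_esr *esr)
{
	uint32_t reg_esr = esr->flags;

	sim_esr_write(esr, ESR_ERR_INT);
	return reg_esr;
}

/* Patched behaviour: read ESR, ACK every pending source. */
static uint32_t irq_new(struct sim_esr *esr)
{
	uint32_t reg_esr = esr->flags;

	if (reg_esr & ESR_ALL_INT)
		sim_esr_write(esr, reg_esr & ESR_ALL_INT);
	return reg_esr;
}

/* The "state change IRQ" part of the NAPI scheduling condition. */
static bool wants_napi_for_state(uint32_t reg_esr)
{
	return (reg_esr & ESR_ERR_STATE) != 0;
}

/*
 * Start with RWRN_INT and ERR_INT pending, run the handler n_irqs times
 * and count how often the state-change condition is true.
 */
static int count_state_schedules(uint32_t (*handler)(struct sim_esr *),
				 int n_irqs)
{
	struct sim_esr esr = { .flags = ESR_RWRN_INT | ESR_ERR_INT };
	int scheduled = 0;

	for (int i = 0; i < n_irqs; i++)
		if (wants_napi_for_state(handler(&esr)))
			scheduled++;
	return scheduled;
}

/* Usage: the stale RWRN_INT keeps the old condition true every time. */
static void demo(void)
{
	assert(count_state_schedules(irq_old, 5) == 5);
	assert(count_state_schedules(irq_new, 5) == 1);
}
```

Calling demo() from a main() checks both cases. In this model the old ACK leaves RWRN_INT set, so the state-change test passes on every interrupt, while the patched ACK makes it pass only once.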


Patch

diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index 165a4c7..111f154 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -118,6 +118,9 @@ 
 	(FLEXCAN_ESR_TWRN_INT | FLEXCAN_ESR_RWRN_INT | FLEXCAN_ESR_BOFF_INT)
 #define FLEXCAN_ESR_ERR_ALL \
 	(FLEXCAN_ESR_ERR_BUS | FLEXCAN_ESR_ERR_STATE)
+#define FLEXCAN_ESR_ALL_INT \
+	(FLEXCAN_ESR_TWRN_INT | FLEXCAN_ESR_RWRN_INT | \
+	 FLEXCAN_ESR_BOFF_INT | FLEXCAN_ESR_ERR_INT)
 
 /* FLEXCAN interrupt flag register (IFLAG) bits */
 #define FLEXCAN_TX_BUF_ID		8
@@ -577,7 +580,9 @@  static irqreturn_t flexcan_irq(int irq, void *dev_id)
 
 	reg_iflag1 = flexcan_read(&regs->iflag1);
 	reg_esr = flexcan_read(&regs->esr);
-	flexcan_write(FLEXCAN_ESR_ERR_INT, &regs->esr);	/* ACK err IRQ */
+	/* ACK all bus error and state change IRQ sources */
+	if (reg_esr & FLEXCAN_ESR_ALL_INT)
+		flexcan_write(reg_esr & FLEXCAN_ESR_ALL_INT, &regs->esr);
 
 	/*
 	 * schedule NAPI in case of:
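
To illustrate why ACKing only ERR_INT can flood on hardware that keeps the other flags latched, here is a minimal user-space sketch in C. It assumes a level-style interrupt line that stays asserted while any of the four ESR sources is pending, and a write-1-to-clear register. The bit positions, MAX_IRQS, ack_err_only(), ack_all_pending() and irqs_until_quiet() are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only; check the SoC manual. */
#define ESR_TWRN_INT	(1u << 17)
#define ESR_RWRN_INT	(1u << 16)
#define ESR_BOFF_INT	(1u << 2)
#define ESR_ERR_INT	(1u << 1)
#define ESR_ALL_INT \
	(ESR_TWRN_INT | ESR_RWRN_INT | ESR_BOFF_INT | ESR_ERR_INT)

#define MAX_IRQS	1000	/* cap so a flooding handler terminates */

typedef void (*ack_fn)(uint32_t *esr);

/* Old ACK: write-1-to-clear of ERR_INT only. */
static void ack_err_only(uint32_t *esr)
{
	*esr &= ~ESR_ERR_INT;
}

/* Patched ACK: clear exactly the sources that are pending. */
static void ack_all_pending(uint32_t *esr)
{
	uint32_t pending = *esr & ESR_ALL_INT;

	if (pending)
		*esr &= ~pending;
}

/*
 * Count how often the handler runs before the modelled interrupt line
 * drops, i.e. before no source is pending any more (capped at MAX_IRQS).
 */
static int irqs_until_quiet(ack_fn ack, uint32_t initial)
{
	uint32_t esr = initial;
	int n = 0;

	while ((esr & ESR_ALL_INT) && n < MAX_IRQS) {
		ack(&esr);
		n++;
	}
	return n;
}

/* Usage: TWRN_INT stays latched with the old ACK, so the cap is hit. */
static void demo(void)
{
	assert(irqs_until_quiet(ack_err_only,
				ESR_TWRN_INT | ESR_ERR_INT) == MAX_IRQS);
	assert(irqs_until_quiet(ack_all_pending,
				ESR_TWRN_INT | ESR_ERR_INT) == 1);
}
```

Calling demo() from a main() checks both cases. With only ERR_INT pending, both variants go quiet after one pass, which fits the report in the thread that the old code appeared to work on SoCs where the warning flags do not stay set.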