IRQ #0 broken on ARM

Message ID CALT56yPs-SSssff1waj81CRQfER4ZKKR=B1jOXVL-aBChicSjg@mail.gmail.com
State New

Commit Message

Dmitry Baryshkov Nov. 21, 2014, 10:31 a.m. UTC
Hello,

After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
(ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
on ARM. It is a valid IRQ and it is quite important (on sa1100 it's GPIO0).

The worst thing is that the CPU will be stuck busy-looping around this IRQ
w/o printing anything to the console or masking the irq. How should we cope
with that? I'd like to propose to either revert the offending commit or to add
the following patch.

Comments

Russell King - ARM Linux Nov. 21, 2014, 10:34 a.m. UTC | #1
On Fri, Nov 21, 2014 at 02:31:05PM +0400, Dmitry Eremin-Solenikov wrote:
> Hello,
> 
> After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
> (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
> on ARM. It is a valid IRQ and it is quite important (on sa1100 it's GPIO0).

No, it is not a valid IRQ.  (It was a mistake to think it was.)
Generic code will always assume IRQ0 is not valid, and the fix is to
fix the places in ARM where we try to use it.
Dmitry Baryshkov Nov. 21, 2014, 10:51 a.m. UTC | #2
2014-11-21 13:34 GMT+03:00 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> On Fri, Nov 21, 2014 at 02:31:05PM +0400, Dmitry Eremin-Solenikov wrote:
>> Hello,
>>
>> After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
>> (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
>> on ARM. It is a valid IRQ and it is quite important (on sa1100 it's GPIO0).
>
> No, it is not a valid IRQ.  (It was a mistake to think it was.)
> Generic code will always assume IRQ0 is not valid, and the fix is to
> fix the places in ARM where we try to use it.

Ok. SA1100, PXA, ebsa110, footbridge, rpc, orion5x, mv78xx0, ixp4xx,
lpc32xx, ks8695 and several other aging and not-so-aging platforms are using IRQ0.

Breaking them in a very strange and silent manner doesn't look like a good
behaviour, does it? What would be a proposed fix?
Uwe Kleine-König Nov. 21, 2014, 10:53 a.m. UTC | #3
On Fri, Nov 21, 2014 at 10:34:56AM +0000, Russell King - ARM Linux wrote:
> On Fri, Nov 21, 2014 at 02:31:05PM +0400, Dmitry Eremin-Solenikov wrote:
> > Hello,
> > 
> > After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
> > (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
> > on ARM. It is a valid IRQ and it is quite important (on sa1100 it's GPIO0).
> 
> No, it is not a valid IRQ.  (It was a mistake to think it was.)
> Generic code will always assume IRQ0 is not valid, and the fix is to
> fix the places in ARM where we try to use it.
To make this more understandable: Linux uses virtual irq numbers. The
virtual irq 0 is invalid. For a given irq domain the (hardware) irq 0 is
of course useful and can be supported. Still for a device driver (which
uses the virtual irq space) 0 should always be invalid.

Best regards
Uwe
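
The distinction above can be illustrated with a toy user-space model. This is only a sketch, not kernel code: `toy_domain`, `toy_find_mapping` and `TOY_NR_HWIRQS` are hypothetical names. It mirrors the convention that a hwirq-to-virq lookup (such as the kernel's irq_find_mapping()) reports "no mapping" as 0, which is why virq 0 cannot also name a real interrupt, while hwirq 0 inside a domain is perfectly usable.

```c
#include <stddef.h>

#define TOY_NR_HWIRQS 4u	/* hypothetical: number of lines in the toy domain */

/*
 * Hypothetical stand-in for an irq domain: a hwirq -> virq table.
 * A table entry of 0 means "no mapping".
 */
struct toy_domain {
	unsigned int virq[TOY_NR_HWIRQS];
};

/* Return the virq mapped to hwirq, or 0 if it is unmapped or out of range. */
static unsigned int toy_find_mapping(const struct toy_domain *d,
				     unsigned int hwirq)
{
	if (d == NULL || hwirq >= TOY_NR_HWIRQS)
		return 0;
	return d->virq[hwirq];
}
```

With a table such as { 16, 17, 0, 0 }, hwirq 0 resolves to virq 16, and an unmapped hwirq resolves to 0, which a caller can only interpret as "invalid".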
Dmitry Baryshkov Nov. 21, 2014, 10:55 a.m. UTC | #4
2014-11-21 13:53 GMT+03:00 Uwe Kleine-König <u.kleine-koenig@pengutronix.de>:
> On Fri, Nov 21, 2014 at 10:34:56AM +0000, Russell King - ARM Linux wrote:
>> On Fri, Nov 21, 2014 at 02:31:05PM +0400, Dmitry Eremin-Solenikov wrote:
>> > Hello,
>> >
>> > After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
>> > (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
>> > on ARM. It is a valid IRQ and it is quite important (on sa1100 it's GPIO0).
>>
>> No, it is not a valid IRQ.  (It was a mistake to think it was.)
>> Generic code will always assume IRQ0 is not valid, and the fix is to
>> fix the places in ARM where we try to use it.
> To make this more understandable: Linux uses virtual irq numbers. The
> virtual irq 0 is invalid. For a given irq domain the (hardware) irq 0 is
> of course useful and can be supported. Still for a device driver (which
> uses the virtual irq space) 0 should always be invalid.

I was talking about virtual irq space, not hwirq.
Dmitry Baryshkov Nov. 21, 2014, 11:01 a.m. UTC | #5
2014-11-21 13:52 GMT+03:00 Marc Zyngier <marc.zyngier@arm.com>:
> On Fri, Nov 21 2014 at 10:31:05 am GMT, Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> wrote:
>> Hello,
>>
>> After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
>> (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
>> on ARM. It is a valid IRQ and it is quite important (on sa1100 it's GPIO0).
>
> Well, this is a valid IRQ number if you're not using irq domains. I may
> be a bit pedantic here, but I think this is an important distinction.
>
>> The worst thing is that the CPU will be stuck busy-looping around this
>> IRQ w/o printing anything to the console or masking the irq. How
>> should we cope with that? I'd like to propose to either revert the
>> offending commit or to add the following patch.
>
> Well, said commit fixes a rather important bug, so I suggest we keep
> it around. Now, as for your suggestion:

From the commit message it was not clear that a bug was being fixed
(it talks only about code duplication).

[skipped]

> As I mentioned above, IRQ0 is not valid when using irq domains. As an
> alternative, how about this:
>
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index a1782f8..9f5bc92 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -365,7 +365,7 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
>          * Some hardware gives randomly wrong interrupts.  Rather
>          * than crashing, do something sensible.
>          */
> -       if (unlikely(!irq || irq >= nr_irqs)) {
> +       if (unlikely((lookup && !irq) || irq >= nr_irqs)) {
>                 ack_bad_irq(irq);
>                 ret = -EINVAL;
>         } else {
>
> I don't have a platform to test this on, but maybe you could give it a
> go and let me know if that helps?

It helps in my case. Thank you. Please add me to Cc if you submit
this patch.
Russell King - ARM Linux Nov. 21, 2014, 11:01 a.m. UTC | #6
On Fri, Nov 21, 2014 at 10:52:37AM +0000, Marc Zyngier wrote:
> On Fri, Nov 21 2014 at 10:31:05 am GMT, Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> wrote:
> > Hello,
> >
> > After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
> > (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
> > on ARM. It is a valid IRQ and it is quite important (on sa1100 it's GPIO0).
> 
> Well, this is a valid IRQ number if you're not using irq domains. I may
> be a bit pedantic here, but I think this is an important distinction.

Linus has decreed it to not be a valid IRQ number, and that's basically
the end of the discussion.  Generic code, and drivers, will increasingly
decide that IRQ0 is not valid, and objecting to it has, and will continue
to elicit a response of "fix ARM".
Marc Zyngier Nov. 21, 2014, 11:17 a.m. UTC | #7
Hi Russell,

On 21/11/14 11:01, Russell King - ARM Linux wrote:
> On Fri, Nov 21, 2014 at 10:52:37AM +0000, Marc Zyngier wrote:
>> On Fri, Nov 21 2014 at 10:31:05 am GMT, Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> wrote:
>>> Hello,
>>>
>>> After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
>>> (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
>>> on ARM. It is a valid IRQ and it is quite important (on sa1100 it's GPIO0).
>>
>> Well, this is a valid IRQ number if you're not using irq domains. I may
>> be a bit pedantic here, but I think this is an important distinction.
> 
> Linus has decreed it to not be a valid IRQ number, and that's basically
> the end of the discussion.  Generic code, and drivers, will increasingly
> decide that IRQ0 is not valid, and objecting to it has, and will continue
> to elicit a response of "fix ARM".

I'm fine with that.

Thanks,

	M.
Dmitry Baryshkov Nov. 21, 2014, 10:20 p.m. UTC | #8
2014-11-22 0:32 GMT+03:00 Robert Jarzmik <robert.jarzmik@free.fr>:
> Marc Zyngier <marc.zyngier@arm.com> writes:
>>> Linus has decreed it to not be a valid IRQ number, and that's basically
>>> the end of the discussion.  Generic code, and drivers, will increasingly
>>> decide that IRQ0 is not valid, and objecting to it has, and will continue
>>> to elicit a response of "fix ARM".
>>
>> I'm fine with that.
> For pxa, why not do something like that [1] ?
>
> Cheers.
>
> --
> Robert
>
> [1]
> ---8>---
>
> From 551eaf75934bd84939a40781470ed3c04d17507a Mon Sep 17 00:00:00 2001
> From: Robert Jarzmik <robert.jarzmik@free.fr>
> Date: Fri, 21 Nov 2014 22:11:42 +0100
> Subject: [PATCH] ARM: pxa: arbitrarily set first interrupt number
>
> As IRQ0, the legacy timer interrupt should not be used as an interrupt
> number, shift the interrupts by a fixed number.
>
> As we had in a special case a shift of 16 when ISA bus was used on a
> PXA, use that value as the first interrupt number, regardless of ISA or
> not.

This will shift the issue from PXA_SSP3 interrupt to ISA IRQ0.
On the other hand ISA IRQs are used only on viper and zeus boards.
And those boards explicitly mark IRQ0 as unused (because it is not
routed through CPLD).

I have another question. To me, those "ISA" interrupts being placed
in front of the PXA interrupts look like some kind of legacy stuff. Do we
still require the "ISA" (well, PC/104) interrupts to be the first ones?
Rob Herring Nov. 21, 2014, 10:27 p.m. UTC | #9
On Fri, Nov 21, 2014 at 3:32 PM, Robert Jarzmik <robert.jarzmik@free.fr> wrote:
> Marc Zyngier <marc.zyngier@arm.com> writes:
>>> Linus has decreed it to not be a valid IRQ number, and that's basically
>>> the end of the discussion.  Generic code, and drivers, will increasingly
>>> decide that IRQ0 is not valid, and objecting to it has, and will continue
>>> to elicit a response of "fix ARM".
>>
>> I'm fine with that.
> For pxa, why not do something like that [1] ?
>
> Cheers.
>
> --
> Robert
>
> [1]
> ---8>---
>
> From 551eaf75934bd84939a40781470ed3c04d17507a Mon Sep 17 00:00:00 2001
> From: Robert Jarzmik <robert.jarzmik@free.fr>
> Date: Fri, 21 Nov 2014 22:11:42 +0100
> Subject: [PATCH] ARM: pxa: arbitrarily set first interrupt number
>
> As IRQ0, the legacy timer interrupt should not be used as an interrupt
> number, shift the interrupts by a fixed number.
>
> As we had in a special case a shift of 16 when ISA bus was used on a
> PXA, use that value as the first interrupt number, regardless of ISA or
> not.
>
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  arch/arm/mach-pxa/include/mach/irqs.h | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/arch/arm/mach-pxa/include/mach/irqs.h b/arch/arm/mach-pxa/include/mach/irqs.h
> index 48c2fd8..9d8983f 100644
> --- a/arch/arm/mach-pxa/include/mach/irqs.h
> +++ b/arch/arm/mach-pxa/include/mach/irqs.h
> @@ -14,12 +14,9 @@
>
>  #ifdef CONFIG_PXA_HAVE_ISA_IRQS

You can get rid of this ifdef and kconfig symbol. It is only used here.

>  #define PXA_ISA_IRQ(x) (x)
> -#define PXA_ISA_IRQ_NUM        (16)
> -#else
> -#define PXA_ISA_IRQ_NUM        (0)
>  #endif
>
> -#define PXA_IRQ(x)     (PXA_ISA_IRQ_NUM + (x))
> +#define PXA_IRQ(x)     (16 + (x))

Perhaps use NR_IRQS_LEGACY here.

>  #define IRQ_SSP3       PXA_IRQ(0)      /* SSP3 service request */
>  #define IRQ_MSL                PXA_IRQ(1)      /* MSL Interface interrupt */
> --
> 2.1.0
Grant Likely Nov. 21, 2014, 10:31 p.m. UTC | #10
On Fri, Nov 21, 2014 at 10:52 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On Fri, Nov 21 2014 at 10:31:05 am GMT, Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> wrote:
>> Hello,
>>
>> After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
>> (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
>> on ARM. It is a valid IRQ and it is quite important (on sa1100 it's GPIO0).
>
> Well, this is a valid IRQ number if you're not using irq domains. I may
> be a bit pedantic here, but I think this is an important distinction.
>
>> The worst thing is that the CPU will be stuck busy-looping around this
>> IRQ w/o printing anything to the console or masking the irq. How
>> should we cope with that? I'd like to propose to either revert the
>> offending commit or to add the following patch.
>
> Well, said commit fixes a rather important bug, so I suggest we keep
> it around. Now, as for your suggestion:
>
>> --
>> With best wishes
>> Dmitry
>>
>> From e87f86497b796ed55fff644bbc75bf1890941829 Mon Sep 17 00:00:00 2001
>> From: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
>> Date: Fri, 21 Nov 2014 13:27:11 +0300
>> Subject: [PATCH] genirq: handle IRQ 0 in __handle_domain_irq
>>
>> __handle_domain_irq() function will ignore (well, report as bad) the IRQ
>> number 0. On some platforms IRQ0 is bad IRQ. On others it is not. And
>> while platforms are still in the process of converging to not using
>> IRQ number 0 as a valid IRQ, I'd like to propose to use IRQ0 as a valid
>> one in __handle_domain_irq().
>>
>> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
>> ---
>>  kernel/irq/irqdesc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
>> index a1782f8..bfbeeb6 100644
>> --- a/kernel/irq/irqdesc.c
>> +++ b/kernel/irq/irqdesc.c
>> @@ -365,7 +365,7 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
>>        * Some hardware gives randomly wrong interrupts.  Rather
>>        * than crashing, do something sensible.
>>        */
>> -     if (unlikely(!irq || irq >= nr_irqs)) {
>> +     if (unlikely(irq >= nr_irqs)) {
>>               ack_bad_irq(irq);
>>               ret = -EINVAL;
>>       } else {
>
> As I mentioned above, IRQ0 is not valid when using irq domains. As an
> alternative, how about this:
>
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index a1782f8..9f5bc92 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -365,7 +365,7 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
>          * Some hardware gives randomly wrong interrupts.  Rather
>          * than crashing, do something sensible.
>          */
> -       if (unlikely(!irq || irq >= nr_irqs)) {
> +       if (unlikely((lookup && !irq) || irq >= nr_irqs)) {
>                 ack_bad_irq(irq);
>                 ret = -EINVAL;
>         } else {
>
> I don't have a platform to test this on, but maybe you could give it a
> go and let me know if that helps?

I have to nak this. It isn't just when using domains that virq 0 is
invalid. It is always invalid. As suggested elsewhere in this thread,
platform code should use a hard offset from the hwirq to the virq to
get off of irq0.

g.
Robert Jarzmik Nov. 22, 2014, 12:18 p.m. UTC | #11
Rob Herring <robherring2@gmail.com> writes:

>>  #ifdef CONFIG_PXA_HAVE_ISA_IRQS
>
> You can get rid of this ifdef and kconfig symbol. It is only used here.
Right.

>> -#define PXA_IRQ(x)     (PXA_ISA_IRQ_NUM + (x))
>> +#define PXA_IRQ(x)     (16 + (x))
>
> Perhaps use NR_IRQS_LEGACY here.
Ah yes, good idea.

I'll send a proper patch next time, not an attached piece of code.

Cheers.
Dmitry Baryshkov Nov. 22, 2014, 12:40 p.m. UTC | #12
2014-11-22 15:18 GMT+03:00 Robert Jarzmik <robert.jarzmik@free.fr>:
> Rob Herring <robherring2@gmail.com> writes:
>
>>>  #ifdef CONFIG_PXA_HAVE_ISA_IRQS
>>
>> You can get rid of this ifdef and kconfig symbol. It is only used here.
> Right.
>
>>> -#define PXA_IRQ(x)     (PXA_ISA_IRQ_NUM + (x))
>>> +#define PXA_IRQ(x)     (16 + (x))
>>
>> Perhaps use NR_IRQS_LEGACY here.
> Ah yes, good idea.

What about using NUM_ISA_INTERRUPTS? This would be logical
if viper & zeus were converted to call irq_domain_add_legacy_isa.
Robert Jarzmik Nov. 22, 2014, 12:55 p.m. UTC | #13
Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> writes:

>>> Perhaps use NR_IRQS_LEGACY here.
>> Ah yes, good idea.
>
> What about using NUM_ISA_INTERRUPTS? This would be logical
> if viper & zeus were converted to call irq_domain_add_legacy_isa.
That would make include/mach/irqs.h depend on include/linux/irqdomain.h.

That's something that makes me feel uneasy, as this irqs.h could very well be
used in assembler files, and I must check it.

On the other hand, NR_IRQS_LEGACY is defined in include/asm/irq.h, and that one
looks like a better candidate for inclusion in mach/irqs.h.

Cheers.
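
The fixed-offset scheme discussed for PXA can be sketched as below. This is a hedged illustration only: the `TOY_` names are hypothetical, and the value 16 is assumed from the literal in the quoted patch rather than taken from the real NR_IRQS_LEGACY definition. Because it uses only preprocessor macros, a header written this way stays usable from assembler files, which is the concern raised above.

```c
/*
 * Assumed stand-in for NR_IRQS_LEGACY; the value 16 comes from the
 * literal used in the quoted PXA patch, not from the real header.
 */
#define TOY_NR_IRQS_LEGACY	16

/* Hypothetical equivalent of PXA_IRQ(x) with a fixed first interrupt number. */
#define TOY_PXA_IRQ(x)		(TOY_NR_IRQS_LEGACY + (x))

#define TOY_IRQ_SSP3		TOY_PXA_IRQ(0)	/* hardware source 0 */
#define TOY_IRQ_MSL		TOY_PXA_IRQ(1)	/* hardware source 1 */
```

With this offset the first PXA hardware source lands on Linux IRQ 16 instead of 0, so no PXA interrupt collides with the number that generic code treats as invalid.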
Patch

From e87f86497b796ed55fff644bbc75bf1890941829 Mon Sep 17 00:00:00 2001
From: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Date: Fri, 21 Nov 2014 13:27:11 +0300
Subject: [PATCH] genirq: handle IRQ 0 in __handle_domain_irq

__handle_domain_irq() function will ignore (well, report as bad) the IRQ
number 0. On some platforms IRQ0 is bad IRQ. On others it is not. And
while platforms are still in the process of converging to not using
IRQ number 0 as a valid IRQ, I'd like to propose to use IRQ0 as a valid
one in __handle_domain_irq().

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
---
 kernel/irq/irqdesc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index a1782f8..bfbeeb6 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -365,7 +365,7 @@  int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
 	 * Some hardware gives randomly wrong interrupts.  Rather
 	 * than crashing, do something sensible.
 	 */
-	if (unlikely(!irq || irq >= nr_irqs)) {
+	if (unlikely(irq >= nr_irqs)) {
 		ack_bad_irq(irq);
 		ret = -EINVAL;
 	} else {
-- 
2.1.1
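
To compare the three checks discussed in this thread, the predicates can be pulled out into a small user-space model. This is a sketch under assumptions, not the kernel function: the `bad_irq_*` helper names are hypothetical, and `lookup` stands for the flag used in Marc Zyngier's alternative quoted in the comments.

```c
#include <stdbool.h>

/* Check currently in __handle_domain_irq(): virq 0 is always rejected. */
static bool bad_irq_current(unsigned int irq, unsigned int nr_irqs)
{
	return !irq || irq >= nr_irqs;
}

/* Check proposed by this patch: only out-of-range numbers are rejected. */
static bool bad_irq_patch(unsigned int irq, unsigned int nr_irqs)
{
	return irq >= nr_irqs;
}

/* Alternative from the comments: reject 0 only after a domain lookup. */
static bool bad_irq_lookup(bool lookup, unsigned int irq,
			   unsigned int nr_irqs)
{
	return (lookup && !irq) || irq >= nr_irqs;
}
```

For irq 0 the current check reports a bad interrupt, this patch accepts it, and the alternative accepts it only when no lookup was requested. All three agree on numbers at or above nr_irqs.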