Message ID | 1328621921-17404-1-git-send-email-LW@KARO-electronics.de |
---|---|
State | New |
Hi,

Thomas Gleixner writes:
> On Tue, 7 Feb 2012, Lothar Waßmann wrote:
>
> > There is a race condition in the threaded IRQ handler code for oneshot
> > interrupts that may lead to disabling an IRQ indefinitely. IRQs are
> > masked before calling the hard-irq handler and are unmasked only after
> > the soft-irq handler has been run. Thus if the hard-irq handler
> > returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
>
> Well, oneshot mode interrupts always had the semantics that the
> threaded handler needs to run unconditionally. In fact the oneshot
> mode was implemented to handle hardware which cannot do anything in
> hard interrupt context to avoid the ugliness of a primary handler
> calling disable_irq_nosync().
>
> So it looks like driver developers decided that the oneshot mode might
> be interesting with a primary handler as well. I can see the reason
> why the tsc2007 driver uses it, but that does not make it a bug in the
> core code in the first place.
>
Then maybe the core code should not check the return value
of the primary handler for IRQ_WAKE_THREAD but call the secondary
handler unconditionally for ONESHOT interrupts.
Or it should be at least documented somewhere that primary handlers
have to return IRQ_WAKE_THREAD in any case for oneshot interrupts.

> > The problem arises also with interrupt controllers that latch a level
> > triggered IRQ until it is acknowledged (like the i.MX28 does).
> > In this case the IRQ status bit will remain asserted after the
> > soft-irq finishes and retrigger the interrupt while the interrupt line
> > is already deasserted.
>
> This does not make sense. We acknowledge interrupts via mask_ack_irq()
> right on entry of handle_level_irq(). So either the interrupt
>
That's right. But at that point the IRQ line is still asserted and
since it is a level IRQ this will not actually clear the interrupt
status bit. Normally the IRQ status bit would self-clear when the IRQ
line is being deasserted (in this case by removing the finger from the
touch panel). But the i.MX28 leaves the IRQ status bit set and it
takes another write to the IRQ status register to remove the bogus IRQ
status.

> controller is completely hosed or this explanation is bogus.
>
The first is the case.


Lothar Waßmann
On Wed, 8 Feb 2012, Lothar Waßmann wrote:
> > So it looks like driver developers decided that the oneshot mode might
> > be interesting with a primary handler as well. I can see the reason
> > why the tsc2007 driver uses it, but that does not make it a bug in the
> > core code in the first place.
> >
> Then maybe the core code should not check the return value
> of the primary handler for IRQ_WAKE_THREAD but call the secondary
> handler unconditionally for ONESHOT interrupts.
> Or it should be at least documented somewhere that primary handlers
> have to return IRQ_WAKE_THREAD in any case for oneshot interrupts.

Well, you know how good we are with documentation :)

> > > The problem arises also with interrupt controllers that latch a level
> > > triggered IRQ until it is acknowledged (like the i.MX28 does).
> > > In this case the IRQ status bit will remain asserted after the
> > > soft-irq finishes and retrigger the interrupt while the interrupt line
> > > is already deasserted.
> >
> > This does not make sense. We acknowledge interrupts via mask_ack_irq()
> > right on entry of handle_level_irq(). So either the interrupt
> >
> That's right. But at that point the IRQ line is still asserted and
> since it is a level IRQ this will not actually clear the interrupt
> status bit. Normally the IRQ status bit would self-clear when the IRQ
> line is being deasserted (in this case by removing the finger from the
> touch panel). But the i.MX28 leaves the IRQ status bit set and it
> takes another write to the IRQ status register to remove the bogus IRQ
> status.

So the question is whether the imx irq chip implementation should
write to the status register on unmask for level type irqs to avoid
spurious interrupts being generated in the first place. This is not
only an optimization for threaded interrupts, afaict this spurious
effect should happen with non threaded interrupts as well.

Did my patch work for you ?

Thanks,

	tglx
Hi,

Thomas Gleixner writes:
> On Wed, 8 Feb 2012, Lothar Waßmann wrote:
> > > > The problem arises also with interrupt controllers that latch a level
> > > > triggered IRQ until it is acknowledged (like the i.MX28 does).
> > > > In this case the IRQ status bit will remain asserted after the
> > > > soft-irq finishes and retrigger the interrupt while the interrupt line
> > > > is already deasserted.
> > >
> > > This does not make sense. We acknowledge interrupts via mask_ack_irq()
> > > right on entry of handle_level_irq(). So either the interrupt
> > >
> > That's right. But at that point the IRQ line is still asserted and
> > since it is a level IRQ this will not actually clear the interrupt
> > status bit. Normally the IRQ status bit would self-clear when the IRQ
> > line is being deasserted (in this case by removing the finger from the
> > touch panel). But the i.MX28 leaves the IRQ status bit set and it
> > takes another write to the IRQ status register to remove the bogus IRQ
> > status.
>
> So the question is whether the imx irq chip implementation should
> write to the status register on unmask for level type irqs to avoid
> spurious interrupts being generated in the first place. This is not
> only an optimization for threaded interrupts, afaict this spurious
> effect should happen with non threaded interrupts as well.
>
> Did my patch work for you ?
>
Sorry, I couldn't test it earlier.
Yes, it works.


Lothar Waßmann
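The latched-status behaviour discussed in the thread can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct latch_model` and the `latch_*` helpers are hypothetical names invented here, not the i.MX28 register interface or any kernel API. The model assumes what Lothar describes: the status bit is set while the line is asserted, an acknowledge has no effect while the line is still high, and the bit stays set after the line drops until it is acknowledged again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one latched, level-triggered interrupt source. */
struct latch_model {
	bool line;	/* current level of the interrupt line */
	bool status;	/* latched status bit in the controller */
	bool masked;	/* interrupt masked at the controller */
};

/* Drive the line. Asserting it latches the status bit; deasserting
 * it does NOT clear the bit (the behaviour reported in the thread). */
void latch_set_line(struct latch_model *m, bool level)
{
	assert(m != NULL);
	m->line = level;
	if (level)
		m->status = true;
}

/* Write to the status register. Assumption: the write only clears
 * the bit when the line is no longer asserted. */
void latch_ack(struct latch_model *m)
{
	assert(m != NULL);
	if (!m->line)
		m->status = false;
}

/* Models what mask_ack_irq() does on entry to the flow handler. */
void latch_mask_ack(struct latch_model *m)
{
	assert(m != NULL);
	m->masked = true;
	latch_ack(m);
}

/* Unmask, optionally acknowledging first (the idea raised by tglx). */
void latch_unmask(struct latch_model *m, bool ack_on_unmask)
{
	assert(m != NULL);
	if (ack_on_unmask)
		latch_ack(m);
	m->masked = false;
}

/* True when the controller would raise the interrupt to the CPU. */
bool latch_pending(const struct latch_model *m)
{
	assert(m != NULL);
	return m->status && !m->masked;
}
```

In this model, asserting the line, calling `latch_mask_ack()`, deasserting the line and then calling `latch_unmask(m, false)` leaves `latch_pending()` true although the line is low, which is the spurious retrigger. With `latch_unmask(m, true)` the stale bit is cleared, while an interrupt whose line is still asserted remains pending.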
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index f7c543a..74fdef9 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -343,6 +343,8 @@ EXPORT_SYMBOL_GPL(handle_simple_irq);
 void
 handle_level_irq(unsigned int irq, struct irq_desc *desc)
 {
+	irqreturn_t ret;
+
 	raw_spin_lock(&desc->lock);
 	mask_ack_irq(desc);
 
@@ -360,10 +362,13 @@ handle_level_irq(unsigned int irq, struct irq_desc *desc)
 	if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data)))
 		goto out_unlock;
 
-	handle_irq_event(desc);
+	ret = handle_irq_event(desc);
 
-	if (!irqd_irq_disabled(&desc->irq_data) && !(desc->istate & IRQS_ONESHOT))
+	if (!irqd_irq_disabled(&desc->irq_data) &&
+	    (!(desc->istate & IRQS_ONESHOT) ||
+	     !(ret & IRQ_WAKE_THREAD)))
 		unmask_irq(desc);
+
 out_unlock:
 	raw_spin_unlock(&desc->lock);
 }
There is a race condition in the threaded IRQ handler code for oneshot
interrupts that may lead to disabling an IRQ indefinitely. IRQs are
masked before calling the hard-irq handler and are unmasked only after
the soft-irq handler has been run. Thus if the hard-irq handler
returns IRQ_HANDLED instead of IRQ_WAKE_THREAD, meaning the soft-irq
will not be called, the interrupt will remain masked forever.

This can happen due to a short pulse on the interrupt line, that
triggers the interrupt logic, but goes undetected by the hard-irq
handler.

The problem can be reproduced with the TSC2007 touch controller driver
that uses ONESHOT interrupts.

The problem arises also with interrupt controllers that latch a level
triggered IRQ until it is acknowledged (like the i.MX28 does).
In this case the IRQ status bit will remain asserted after the
soft-irq finishes and retrigger the interrupt while the interrupt line
is already deasserted.

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
---
 kernel/irq/chip.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)
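The masking problem described in the commit message can be reproduced in a small user-space model. This is a minimal sketch, not kernel code: `struct model_irq`, `model_handle_level_irq()`, `model_thread_done()` and the `MODEL_*` constants are hypothetical names made up for illustration. The `patched` flag selects between the original unmask condition and the one introduced by the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model-only stand-ins for the irqreturn_t bits (hypothetical names). */
enum {
	MODEL_IRQ_NONE		= 0,
	MODEL_IRQ_HANDLED	= 1 << 0,
	MODEL_IRQ_WAKE_THREAD	= 1 << 1,
};

struct model_irq {
	bool masked;	/* line masked at the modelled controller */
	bool oneshot;	/* stands in for IRQS_ONESHOT in desc->istate */
	bool disabled;	/* stands in for irqd_irq_disabled() */
};

/*
 * Model of the level flow handler. primary_ret is what the hard-irq
 * handler returned. Returns true if the threaded handler was woken.
 */
bool model_handle_level_irq(struct model_irq *irq, unsigned int primary_ret,
			    bool patched)
{
	bool unmask;

	assert(irq != NULL);
	irq->masked = true;		/* models mask_ack_irq() */
	if (irq->disabled)
		return false;

	if (patched)			/* condition from the diff */
		unmask = !irq->oneshot ||
			 !(primary_ret & MODEL_IRQ_WAKE_THREAD);
	else				/* original condition */
		unmask = !irq->oneshot;

	if (unmask)
		irq->masked = false;	/* models unmask_irq() */

	return (primary_ret & MODEL_IRQ_WAKE_THREAD) != 0;
}

/* Models the unmask that happens after the threaded handler has run. */
void model_thread_done(struct model_irq *irq)
{
	assert(irq != NULL);
	if (irq->oneshot && !irq->disabled)
		irq->masked = false;
}
```

With `patched` false, a oneshot interrupt whose primary handler returns `MODEL_IRQ_HANDLED` leaves `masked` set and no thread is woken to clear it, which is the stuck state the commit message describes. With `patched` true the same call leaves the line unmasked, and the `MODEL_IRQ_WAKE_THREAD` case still stays masked until `model_thread_done()` runs.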