[1/3] pinctrl: msm: Really mask level interrupts to prevent latching
diff mbox series

Message ID 20180618205255.246104-2-swboyd@chromium.org
State New
Headers show
Series
  • pinctrl: msm interrupt and muxing fixes
Related show

Commit Message

Stephen Boyd June 18, 2018, 8:52 p.m. UTC
The interrupt controller hardware in this pin controller has two status
enable bits. The first "normal" status enable bit enables or disables
the summary interrupt line being raised when a gpio interrupt triggers
and the "raw" status enable bit allows or prevents the hardware from
latching an interrupt into the status register for a gpio interrupt.
Currently we just toggle the "normal" status enable bit in the mask and
unmask ops so that the summary irq interrupt going to the CPU's
interrupt controller doesn't trigger for the masked gpio interrupt.

For a level triggered interrupt, the flow would be as follows: the pin
controller sees the interrupt, latches the status into the status
register, raises the summary irq to the CPU, summary irq handler runs
and calls handle_level_irq(), handle_level_irq() masks and acks the gpio
interrupt, the interrupt handler runs, and finally unmask the interrupt.
When the interrupt handler completes, we expect that the interrupt line
level will go back to the deasserted state so the genirq code can unmask
the interrupt without it triggering again.

If we only mask the interrupt by clearing the "normal" status enable bit
then we'll ack the interrupt but it will continue to show up as pending
in the status register because the raw status bit is enabled, the
hardware hasn't deasserted the line, and thus the asserted state latches
into the status register again. When the hardware deasserts the
interrupt the pin controller still thinks there is a pending unserviced
level interrupt because it latched it earlier. This behavior causes
software to see an extra interrupt for level type interrupts each time
the interrupt is handled.

Let's fix this by clearing the raw status enable bit for level type
interrupts so that the hardware stops latching the status of the
interrupt after we ack it. We don't do this for edge type interrupts
because it seems that toggling the raw status enable bit for edge type
interrupts causes spurious edge interrupts.

Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Doug Anderson <dianders@chromium.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
---
 drivers/pinctrl/qcom/pinctrl-msm.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Doug Anderson June 18, 2018, 10:43 p.m. UTC | #1
Hi,

On Mon, Jun 18, 2018 at 1:52 PM, Stephen Boyd <swboyd@chromium.org> wrote:
> The interrupt controller hardware in this pin controller has two status
> enable bits. The first "normal" status enable bit enables or disables
> the summary interrupt line being raised when a gpio interrupt triggers
> and the "raw" status enable bit allows or prevents the hardware from
> latching an interrupt into the status register for a gpio interrupt.
> Currently we just toggle the "normal" status enable bit in the mask and
> unmask ops so that the summary irq interrupt going to the CPU's
> interrupt controller doesn't trigger for the masked gpio interrupt.
>
> For a level triggered interrupt, the flow would be as follows: the pin
> controller sees the interrupt, latches the status into the status
> register, raises the summary irq to the CPU, summary irq handler runs
> and calls handle_level_irq(), handle_level_irq() masks and acks the gpio
> interrupt, the interrupt handler runs, and finally unmask the interrupt.
> When the interrupt handler completes, we expect that the interrupt line
> level will go back to the deasserted state so the genirq code can unmask
> the interrupt without it triggering again.
>
> If we only mask the interrupt by clearing the "normal" status enable bit
> then we'll ack the interrupt but it will continue to show up as pending
> in the status register because the raw status bit is enabled, the
> hardware hasn't deasserted the line, and thus the asserted state latches
> into the status register again. When the hardware deasserts the
> interrupt the pin controller still thinks there is a pending unserviced
> level interrupt because it latched it earlier. This behavior causes
> software to see an extra interrupt for level type interrupts each time
> the interrupt is handled.
>
> Let's fix this by clearing the raw status enable bit for level type
> interrupts so that the hardware stops latching the status of the
> interrupt after we ack it. We don't do this for edge type interrupts
> because it seems that toggling the raw status enable bit for edge type
> interrupts causes spurious edge interrupts.
>
> Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
> Cc: Doug Anderson <dianders@chromium.org>
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> ---
>  drivers/pinctrl/qcom/pinctrl-msm.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
> index 0e22f52b2a19..3563c4394837 100644
> --- a/drivers/pinctrl/qcom/pinctrl-msm.c
> +++ b/drivers/pinctrl/qcom/pinctrl-msm.c
> @@ -626,6 +626,19 @@ static void msm_gpio_irq_mask(struct irq_data *d)
>         raw_spin_lock_irqsave(&pctrl->lock, flags);
>
>         val = readl(pctrl->regs + g->intr_cfg_reg);
> +       /*
> +        * Leaving the RAW_STATUS_EN bit enabled causes level interrupts that
> +        * are still asserted to re-latch after we ack them. Clear the raw

As I understand it acking a level-triggered interrupt is not totally
sensible anyway. IMO the fundamental "brokenness" of the MSM pin
controller is that it's latching level-triggered interrupts in the
first place when it shouldn't be. Instead of latching level triggered
interrupts (and requiring an Ack), the controller should simply stop
asserting its interrupt when the level goes away (if you want, you can
think of this in your mind as the controller auto-acking its own
interrupt when the level goes away).

The reason things aren't more broken than they are is that the Linux
interrupt core still Acks level-triggered interrupts (it just does it
at the beginning instead of the end).  IMO this is not fundamentally
something that the Linux interrupt core needs to do, but it does it
anyway (maybe there are a bunch of interrupt controllers that still
need this even though it's strange?)

Of course, maybe someone can tell me that I'm wrong and everyone else
expects that level-triggered interrupts are fundamentally supposed to
be Acked but that they are only supposed to re-latch if they
transition away from their current level and then back again?


> +        * status enable bit too so the interrupt can't even latch into the
> +        * hardware while it's masked, but only do this for level interrupts
> +        * because edge interrupts have a problem with the raw status bit
> +        * toggling and causing spurious interrupts.

Even if there was no spurious interrupt it would be wrong to use
RAW_STATUS_EN for edge-triggered interrupts.  For edge-triggered
interrupts it's important to continue to latch assertions even when
masked.



> +        */
> +       if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
> +               val &= ~BIT(g->intr_raw_status_bit);
> +               writel(val, pctrl->regs + g->intr_cfg_reg);

Do you know if it's important to do a 2nd write here, or could this be
combined with the next writel()?

> +       }
> +
>         val &= ~BIT(g->intr_enable_bit);
>         writel(val, pctrl->regs + g->intr_cfg_reg);
>
> @@ -647,6 +660,10 @@ static void msm_gpio_irq_unmask(struct irq_data *d)
>         raw_spin_lock_irqsave(&pctrl->lock, flags);
>
>         val = readl(pctrl->regs + g->intr_cfg_reg);
> +       if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
> +               val |= BIT(g->intr_raw_status_bit);
> +               writel(val, pctrl->regs + g->intr_cfg_reg);

Same question about whether this could be combined with the next
writel().  ...although I could imagine that the answer might be
different for mask and unmask.

...if it can be combined, you can totally get rid of the "if" test and
always "OR" in the bit, right?


Despite the above comments, I'm in favor of this patch.  I'd be
curious about whether we can remove the two writel() calls, but I'd
only suggest the comments be changed if someone else agrees with me
about the fundamental nature of level-triggered interrupts with
regards to Acking.  :-P

-Doug
--
To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Boyd June 18, 2018, 11:28 p.m. UTC | #2
Quoting Doug Anderson (2018-06-18 15:43:06)
> 
> On Mon, Jun 18, 2018 at 1:52 PM, Stephen Boyd <swboyd@chromium.org> wrote:
> 
> > +        */
> > +       if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
> > +               val &= ~BIT(g->intr_raw_status_bit);
> > +               writel(val, pctrl->regs + g->intr_cfg_reg);
> 
> Do you know if it's important to do a 2nd write here, or could this be
> combined with the next writel()?

I haven't tried combining the writes. It felt safer to keep them split
up so that both bits don't toggle at the same time, but I don't know if
it actually matters.

> 
> > +       }
> > +
> >         val &= ~BIT(g->intr_enable_bit);
> >         writel(val, pctrl->regs + g->intr_cfg_reg);
> >
> > @@ -647,6 +660,10 @@ static void msm_gpio_irq_unmask(struct irq_data *d)
> >         raw_spin_lock_irqsave(&pctrl->lock, flags);
> >
> >         val = readl(pctrl->regs + g->intr_cfg_reg);
> > +       if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
> > +               val |= BIT(g->intr_raw_status_bit);
> > +               writel(val, pctrl->regs + g->intr_cfg_reg);
> 
> Same question about whether this could be combined with the next
> writel().  ...although I could imagine that the answer might be
> different for mask and unmask.

We probably need someone from qcom side to determine if these can be
combined. I can give it a try and see if anything goes wrong but my
confidence level will only be anecdotal. It's worth a shot.

> 
> ...if it can be combined, you can totally get rid of the "if" test and
> always "OR" in the bit, right?

Yes.

--
To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Anderson June 18, 2018, 11:38 p.m. UTC | #3
Hi,

On Mon, Jun 18, 2018 at 4:28 PM, Stephen Boyd <swboyd@chromium.org> wrote:
> Quoting Doug Anderson (2018-06-18 15:43:06)
>>
>> On Mon, Jun 18, 2018 at 1:52 PM, Stephen Boyd <swboyd@chromium.org> wrote:
>>
>> > +        */
>> > +       if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
>> > +               val &= ~BIT(g->intr_raw_status_bit);
>> > +               writel(val, pctrl->regs + g->intr_cfg_reg);
>>
>> Do you know if it's important to do a 2nd write here, or could this be
>> combined with the next writel()?
>
> I haven't tried combining the writes. It felt safer to keep them split
> up so that both bits don't toggle at the same time, but I don't know if
> it actually matters.

Maybe I'm a glutton for punishment, but I'd say go for it, unless
someone from Qualcomm says "no way".

In the very least in the "unmask" case it seems pretty safe.  IMHO if
re-enabling the "raw" status caused a glitch we'd already be hitting
problems.  Specifically the glitch would end up getting latched
(whee!) and then we'd unmask and see the glitch anyway.

...and actually for the "mask" case it seems like you've written it
the less-safe way anyway.  We know that masking can't cause some sort
of glitch (since that's the old code), but I guess we don't know
whether disabling the "raw" status could cause a glitch.  To be the
absolutely safest you'd do the new disable of the "raw" status _after_
the old masking.  ...but as per above I'd just go whole hog and
combine them.  :-P

As with everything I write, feel free to tell me I'm being stupid and
I'll try to shut up.  ;-)


-Doug
--
To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Boyd June 19, 2018, 9:14 p.m. UTC | #4
Quoting Doug Anderson (2018-06-18 16:38:27)
> Hi,
> 
> On Mon, Jun 18, 2018 at 4:28 PM, Stephen Boyd <swboyd@chromium.org> wrote:
> > Quoting Doug Anderson (2018-06-18 15:43:06)
> >>
> >> On Mon, Jun 18, 2018 at 1:52 PM, Stephen Boyd <swboyd@chromium.org> wrote:
> >>
> >> > +        */
> >> > +       if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
> >> > +               val &= ~BIT(g->intr_raw_status_bit);
> >> > +               writel(val, pctrl->regs + g->intr_cfg_reg);
> >>
> >> Do you know if it's important to do a 2nd write here, or could this be
> >> combined with the next writel()?
> >
> > I haven't tried combining the writes. It felt safer to keep them split
> > up so that both bits don't toggle at the same time, but I don't know if
> > it actually matters.
> 
> Maybe I'm a glutton for punishment, but I'd say go for it, unless
> someone from Qualcomm says "no way".
> 
> In the very least in the "unmask" case it seems pretty safe.  IMHO if
> re-enabling the "raw" status caused a glitch we'd already be hitting
> problems.  Specifically the glitch would end up getting latched
> (whee!) and then we'd unmask and see the glitch anyway.
> 
> ...and actually for the "mask" case it seems like you've written it
> the less-safe way anyway.  We know that masking can't cause some sort
> of glitch (since that's the old code), but I guess we don't know
> whether disabling the "raw" status could cause a glitch.  To be the
> absolutely safest you'd do the new disable of the "raw" status _after_
> the old masking.  ...but as per above I'd just go whole hog and
> combine them.  :-P
> 
> As with everything I write, feel free to tell me I'm being stupid and
> I'll try to shut up.  ;-)
> 

I've tested it and it seems to work by combining the writes in mask and
unmask. I will resend it with the combination tomorrow or the next day
in case anyone else has comments on this series.

--
To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Andersson June 20, 2018, 6:45 a.m. UTC | #5
On Mon 18 Jun 13:52 PDT 2018, Stephen Boyd wrote:

> The interrupt controller hardware in this pin controller has two status
> enable bits. The first "normal" status enable bit enables or disables
> the summary interrupt line being raised when a gpio interrupt triggers
> and the "raw" status enable bit allows or prevents the hardware from
> latching an interrupt into the status register for a gpio interrupt.
> Currently we just toggle the "normal" status enable bit in the mask and
> unmask ops so that the summary irq interrupt going to the CPU's
> interrupt controller doesn't trigger for the masked gpio interrupt.
> 
> For a level triggered interrupt, the flow would be as follows: the pin
> controller sees the interrupt, latches the status into the status
> register, raises the summary irq to the CPU, summary irq handler runs
> and calls handle_level_irq(), handle_level_irq() masks and acks the gpio
> interrupt, the interrupt handler runs, and finally unmask the interrupt.
> When the interrupt handler completes, we expect that the interrupt line
> level will go back to the deasserted state so the genirq code can unmask
> the interrupt without it triggering again.
> 
> If we only mask the interrupt by clearing the "normal" status enable bit
> then we'll ack the interrupt but it will continue to show up as pending
> in the status register because the raw status bit is enabled, the
> hardware hasn't deasserted the line, and thus the asserted state latches
> into the status register again. When the hardware deasserts the
> interrupt the pin controller still thinks there is a pending unserviced
> level interrupt because it latched it earlier. This behavior causes
> software to see an extra interrupt for level type interrupts each time
> the interrupt is handled.
> 
> Let's fix this by clearing the raw status enable bit for level type
> interrupts so that the hardware stops latching the status of the
> interrupt after we ack it. We don't do this for edge type interrupts
> because it seems that toggling the raw status enable bit for edge type
> interrupts causes spurious edge interrupts.
> 
> Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
> Cc: Doug Anderson <dianders@chromium.org>
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> ---
>  drivers/pinctrl/qcom/pinctrl-msm.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
> index 0e22f52b2a19..3563c4394837 100644
> --- a/drivers/pinctrl/qcom/pinctrl-msm.c
> +++ b/drivers/pinctrl/qcom/pinctrl-msm.c
> @@ -626,6 +626,19 @@ static void msm_gpio_irq_mask(struct irq_data *d)
>  	raw_spin_lock_irqsave(&pctrl->lock, flags);
>  
>  	val = readl(pctrl->regs + g->intr_cfg_reg);
> +	/*
> +	 * Leaving the RAW_STATUS_EN bit enabled causes level interrupts that
> +	 * are still asserted to re-latch after we ack them. Clear the raw
> +	 * status enable bit too so the interrupt can't even latch into the
> +	 * hardware while it's masked, but only do this for level interrupts
> +	 * because edge interrupts have a problem with the raw status bit
> +	 * toggling and causing spurious interrupts.
> +	 */
> +	if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
> +		val &= ~BIT(g->intr_raw_status_bit);
> +		writel(val, pctrl->regs + g->intr_cfg_reg);
> +	}
> +
>  	val &= ~BIT(g->intr_enable_bit);
>  	writel(val, pctrl->regs + g->intr_cfg_reg);
>  
> @@ -647,6 +660,10 @@ static void msm_gpio_irq_unmask(struct irq_data *d)
>  	raw_spin_lock_irqsave(&pctrl->lock, flags);
>  
>  	val = readl(pctrl->regs + g->intr_cfg_reg);
> +	if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
> +		val |= BIT(g->intr_raw_status_bit);
> +		writel(val, pctrl->regs + g->intr_cfg_reg);
> +	}
>  	val |= BIT(g->intr_enable_bit);
>  	writel(val, pctrl->regs + g->intr_cfg_reg);

I looked at the TLMM documentation, which states that the status bit
should be cleared after handling the interrupt and this driver used to
do this.

But Timur managed to hit the race where we lost edge triggered
interrupts with this behavior, so we changed it in the following commit:

a6566710adaa ("pinctrl: qcom: Don't clear status bit on irq_unmask")


But the reason that I had this in the driver originally is that msm-3.10
does this (clear status bit in unmask), so perhaps the appropriate way
to solve is to follow the documentation and the downstream driver and
ack the interrupt in unmask - but do so only for level triggered
interrupts?

Regards,
Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Anderson June 20, 2018, 3:52 p.m. UTC | #6
Hi,

On Tue, Jun 19, 2018 at 11:45 PM, Bjorn Andersson
<bjorn.andersson@linaro.org> wrote:
> On Mon 18 Jun 13:52 PDT 2018, Stephen Boyd wrote:
>
>> The interrupt controller hardware in this pin controller has two status
>> enable bits. The first "normal" status enable bit enables or disables
>> the summary interrupt line being raised when a gpio interrupt triggers
>> and the "raw" status enable bit allows or prevents the hardware from
>> latching an interrupt into the status register for a gpio interrupt.
>> Currently we just toggle the "normal" status enable bit in the mask and
>> unmask ops so that the summary irq interrupt going to the CPU's
>> interrupt controller doesn't trigger for the masked gpio interrupt.
>>
>> For a level triggered interrupt, the flow would be as follows: the pin
>> controller sees the interrupt, latches the status into the status
>> register, raises the summary irq to the CPU, summary irq handler runs
>> and calls handle_level_irq(), handle_level_irq() masks and acks the gpio
>> interrupt, the interrupt handler runs, and finally unmask the interrupt.
>> When the interrupt handler completes, we expect that the interrupt line
>> level will go back to the deasserted state so the genirq code can unmask
>> the interrupt without it triggering again.
>>
>> If we only mask the interrupt by clearing the "normal" status enable bit
>> then we'll ack the interrupt but it will continue to show up as pending
>> in the status register because the raw status bit is enabled, the
>> hardware hasn't deasserted the line, and thus the asserted state latches
>> into the status register again. When the hardware deasserts the
>> interrupt the pin controller still thinks there is a pending unserviced
>> level interrupt because it latched it earlier. This behavior causes
>> software to see an extra interrupt for level type interrupts each time
>> the interrupt is handled.
>>
>> Let's fix this by clearing the raw status enable bit for level type
>> interrupts so that the hardware stops latching the status of the
>> interrupt after we ack it. We don't do this for edge type interrupts
>> because it seems that toggling the raw status enable bit for edge type
>> interrupts causes spurious edge interrupts.
>>
>> Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
>> Cc: Doug Anderson <dianders@chromium.org>
>> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
>> ---
>>  drivers/pinctrl/qcom/pinctrl-msm.c | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
>> index 0e22f52b2a19..3563c4394837 100644
>> --- a/drivers/pinctrl/qcom/pinctrl-msm.c
>> +++ b/drivers/pinctrl/qcom/pinctrl-msm.c
>> @@ -626,6 +626,19 @@ static void msm_gpio_irq_mask(struct irq_data *d)
>>       raw_spin_lock_irqsave(&pctrl->lock, flags);
>>
>>       val = readl(pctrl->regs + g->intr_cfg_reg);
>> +     /*
>> +      * Leaving the RAW_STATUS_EN bit enabled causes level interrupts that
>> +      * are still asserted to re-latch after we ack them. Clear the raw
>> +      * status enable bit too so the interrupt can't even latch into the
>> +      * hardware while it's masked, but only do this for level interrupts
>> +      * because edge interrupts have a problem with the raw status bit
>> +      * toggling and causing spurious interrupts.
>> +      */
>> +     if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
>> +             val &= ~BIT(g->intr_raw_status_bit);
>> +             writel(val, pctrl->regs + g->intr_cfg_reg);
>> +     }
>> +
>>       val &= ~BIT(g->intr_enable_bit);
>>       writel(val, pctrl->regs + g->intr_cfg_reg);
>>
>> @@ -647,6 +660,10 @@ static void msm_gpio_irq_unmask(struct irq_data *d)
>>       raw_spin_lock_irqsave(&pctrl->lock, flags);
>>
>>       val = readl(pctrl->regs + g->intr_cfg_reg);
>> +     if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
>> +             val |= BIT(g->intr_raw_status_bit);
>> +             writel(val, pctrl->regs + g->intr_cfg_reg);
>> +     }
>>       val |= BIT(g->intr_enable_bit);
>>       writel(val, pctrl->regs + g->intr_cfg_reg);
>
> I looked at the TLMM documentation, which states that the status bit
> should be cleared after handling the interrupt and this driver used to
> do this.
>
> But Timur managed to hit the race where we lost edge triggered
> interrupts with this behavior, so we changed it in the following commit:
>
> a6566710adaa ("pinctrl: qcom: Don't clear status bit on irq_unmask")
>
>
> But the reason that I had this in the driver originally is that msm-3.10
> does this (clear status bit in unmask), so perhaps the appropriate way
> to solve is to follow the documentation and the downstream driver and
> ack the interrupt in unmask - but do so only for level triggered
> interrupts?

LOL.  Stephen and I had an offline discussion about this and acking in
the unmask was my preferred solution too.  It matches what I did to
solve the same problem on a different controller back in 2013.  See
commit 5a68e7a748c0 ("pinctrl: exynos: ack level-triggered interrupts
before unmasking").  Stephen was of the opinion that it was cleaner to
fully disable the interrupt and that the "ack on unmask" was a bit of
a hack.  I figured that both worked OK and so I didn't insist.

Anyway, either is fine with me w/ a slight personal preference to
acking in unmask.

-Doug
--
To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Boyd June 21, 2018, 3:14 p.m. UTC | #7
Quoting Bjorn Andersson (2018-06-19 23:45:09)
> On Mon 18 Jun 13:52 PDT 2018, Stephen Boyd wrote:
> > @@ -647,6 +660,10 @@ static void msm_gpio_irq_unmask(struct irq_data *d)
> >       raw_spin_lock_irqsave(&pctrl->lock, flags);
> >  
> >       val = readl(pctrl->regs + g->intr_cfg_reg);
> > +     if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
> > +             val |= BIT(g->intr_raw_status_bit);
> > +             writel(val, pctrl->regs + g->intr_cfg_reg);
> > +     }
> >       val |= BIT(g->intr_enable_bit);
> >       writel(val, pctrl->regs + g->intr_cfg_reg);
> 
> I looked at the TLMM documentation, which states that the status bit
> should be cleared after handling the interrupt and this driver used to
> do this.

Nice!

> 
> But Timur managed to hit the race where we lost edge triggered
> interrupts with this behavior, so we changed it in the following commit:
> 
> a6566710adaa ("pinctrl: qcom: Don't clear status bit on irq_unmask")
> 
> 
> But the reason that I had this in the driver originally is that msm-3.10
> does this (clear status bit in unmask), so perhaps the appropriate way
> to solve is to follow the documentation and the downstream driver and
> ack the interrupt in unmask - but do so only for level triggered
> interrupts?
> 

Clearing the status bit (basically acking the gpio irq) can be done in
unmask for level triggered interrupts. That works and as you say it's
even documented.

I didn't implement that because it felt better to prevent the status
from latching in the hardware while the interrupt is masked. My
understanding of irq mask semantics is that the interrupt shouldn't be
"pending" during the time between mask and unmask and clearing the raw
status allows us to do that properly without messing with the status bit
on the unmask path. It also means that the ack operation really does ack
the irq status bit and cause it to go away. I suppose there is one case
where I'm wrong though, and that is when the irq is unmasked on irq
startup where we don't want to see a spurious latched level interrupt
that occurred before we booted.

That problem may be possible with bad bootloaders that are leaving some
status bit latched in there, but also we would want to fix that for edge
type interrupts too, so we would need to clear the status bit regardless
of the level on irq startup and hope an edge isn't lost on startup.

--
To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Andersson June 22, 2018, 5:56 p.m. UTC | #8
On Thu 21 Jun 08:14 PDT 2018, Stephen Boyd wrote:

> Quoting Bjorn Andersson (2018-06-19 23:45:09)
> > On Mon 18 Jun 13:52 PDT 2018, Stephen Boyd wrote:
> > > @@ -647,6 +660,10 @@ static void msm_gpio_irq_unmask(struct irq_data *d)
> > >       raw_spin_lock_irqsave(&pctrl->lock, flags);
> > >  
> > >       val = readl(pctrl->regs + g->intr_cfg_reg);
> > > +     if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
> > > +             val |= BIT(g->intr_raw_status_bit);
> > > +             writel(val, pctrl->regs + g->intr_cfg_reg);
> > > +     }
> > >       val |= BIT(g->intr_enable_bit);
> > >       writel(val, pctrl->regs + g->intr_cfg_reg);
> > 
> > I looked at the TLMM documentation, which states that the status bit
> > should be cleared after handling the interrupt and this driver used to
> > do this.
> 
> Nice!
> 
> > 
> > But Timur managed to hit the race where we lost edge triggered
> > interrupts with this behavior, so we changed it in the following commit:
> > 
> > a6566710adaa ("pinctrl: qcom: Don't clear status bit on irq_unmask")
> > 
> > 
> > But the reason that I had this in the driver originally is that msm-3.10
> > does this (clear status bit in unmask), so perhaps the appropriate way
> > to solve is to follow the documentation and the downstream driver and
> > ack the interrupt in unmask - but do so only for level triggered
> > interrupts?
> > 
> 
> Clearing the status bit (basically acking the gpio irq) can be done in
> unmask for level triggered interrupts. That works and as you say it's
> even documented.
> 
> I didn't implement that because it felt better to prevent the status
> from latching in the hardware while the interrupt is masked. My
> understanding of irq mask semantics is that the interrupt shouldn't be
> "pending" during the time between mask and unmask and clearing the raw
> status allows us to do that properly without messing with the status bit
> on the unmask path. It also means that the ack operation really does ack
> the irq status bit and cause it to go away. I suppose there is one case
> where I'm wrong though, and that is when the irq is unmasked on irq
> startup where we don't want to see a spurious latched level interrupt
> that occurred before we booted.
> 

I took another pass through the irq code and I agree, while the late ack
does solve the problem it's more intuitive if we can prevent the
latching.

> That problem may be possible with bad bootloaders that are leaving some
> status bit latched in there, but also we would want to fix that for edge
> type interrupts too, so we would need to clear the status bit regardless
> of the level on irq startup and hope an edge isn't lost on startup.
> 

I would prefer that we handle that case explicitly.


There are some concerns from msm-3.10 regarding writing the intr-raw bit
at the same time as the other bits, but I think that relates to the fact
that the downstream driver used to configure and enabled raw-state in
the same write. So please respin v2 as you planned.

Regards,
Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox series

diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
index 0e22f52b2a19..3563c4394837 100644
--- a/drivers/pinctrl/qcom/pinctrl-msm.c
+++ b/drivers/pinctrl/qcom/pinctrl-msm.c
@@ -626,6 +626,19 @@  static void msm_gpio_irq_mask(struct irq_data *d)
 	raw_spin_lock_irqsave(&pctrl->lock, flags);
 
 	val = readl(pctrl->regs + g->intr_cfg_reg);
+	/*
+	 * Leaving the RAW_STATUS_EN bit enabled causes level interrupts that
+	 * are still asserted to re-latch after we ack them. Clear the raw
+	 * status enable bit too so the interrupt can't even latch into the
+	 * hardware while it's masked, but only do this for level interrupts
+	 * because edge interrupts have a problem with the raw status bit
+	 * toggling and causing spurious interrupts.
+	 */
+	if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
+		val &= ~BIT(g->intr_raw_status_bit);
+		writel(val, pctrl->regs + g->intr_cfg_reg);
+	}
+
 	val &= ~BIT(g->intr_enable_bit);
 	writel(val, pctrl->regs + g->intr_cfg_reg);
 
@@ -647,6 +660,10 @@  static void msm_gpio_irq_unmask(struct irq_data *d)
 	raw_spin_lock_irqsave(&pctrl->lock, flags);
 
 	val = readl(pctrl->regs + g->intr_cfg_reg);
+	if (irqd_get_trigger_type(d) & IRQ_TYPE_LEVEL_MASK) {
+		val |= BIT(g->intr_raw_status_bit);
+		writel(val, pctrl->regs + g->intr_cfg_reg);
+	}
 	val |= BIT(g->intr_enable_bit);
 	writel(val, pctrl->regs + g->intr_cfg_reg);