diff mbox series

[v5,3/6] genirq/PM: Introduce IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND flag

Message ID 1598113021-4149-4-git-send-email-mkshah@codeaurora.org
State New
Headers show
Series irqchip: qcom: pdc: Introduce irq_set_wake call | expand

Commit Message

Maulik Shah Aug. 22, 2020, 4:16 p.m. UTC
An interrupt that is disabled/masked but set for wakeup still
needs to be able to wake up the system from sleep states like
"suspend to RAM".

Introduce IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND flag. If this flag
is set wake irqs will get enabled/unmasked on suspend entry by
invoking .irq_enable/.irq_unmask callback of irqchip.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Maulik Shah <mkshah@codeaurora.org>
---
 include/linux/irq.h  | 41 ++++++++++++++++++++++-------------------
 kernel/irq/debugfs.c |  1 +
 kernel/irq/pm.c      |  7 ++++++-
 3 files changed, 29 insertions(+), 20 deletions(-)

Comments

Stephen Boyd Aug. 25, 2020, 10:12 a.m. UTC | #1
Quoting Maulik Shah (2020-08-22 09:16:58)
> diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
> index c6c7e18..2cc800b 100644
> --- a/kernel/irq/pm.c
> +++ b/kernel/irq/pm.c
> @@ -69,12 +69,17 @@ void irq_pm_remove_action(struct irq_desc *desc, struct irqaction *action)
>  
>  static bool suspend_device_irq(struct irq_desc *desc)
>  {
> +       unsigned long chipflags = irq_desc_get_chip(desc)->flags;
> +
>         if (!desc->action || irq_desc_is_chained(desc) ||
>             desc->no_suspend_depth)
>                 return false;
>  
>         if (irqd_is_wakeup_set(&desc->irq_data)) {
>                 irqd_set(&desc->irq_data, IRQD_WAKEUP_ARMED);
> +
> +               if (chipflags & IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND)
> +                       irq_enable(desc);

Where is the corresponding change to resume_irq()? Don't we need to
disable an irq if it was disabled on suspend and forcibly enabled here?

>                 /*
>                  * We return true here to force the caller to issue
>                  * synchronize_irq(). We need to make sure that the
> @@ -93,7 +98,7 @@ static bool suspend_device_irq(struct irq_desc *desc)
>          * chip level. The chip implementation indicates that with
>          * IRQCHIP_MASK_ON_SUSPEND.
>          */
> -       if (irq_desc_get_chip(desc)->flags & IRQCHIP_MASK_ON_SUSPEND)
> +       if (chipflags & IRQCHIP_MASK_ON_SUSPEND)
>                 mask_irq(desc);
>         return true;
>  }
Thomas Gleixner Aug. 25, 2020, 9:38 p.m. UTC | #2
On Tue, Aug 25 2020 at 03:12, Stephen Boyd wrote:
> Quoting Maulik Shah (2020-08-22 09:16:58)
>> diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
>> index c6c7e18..2cc800b 100644
>> --- a/kernel/irq/pm.c
>> +++ b/kernel/irq/pm.c
>> @@ -69,12 +69,17 @@ void irq_pm_remove_action(struct irq_desc *desc, struct irqaction *action)
>>  
>>  static bool suspend_device_irq(struct irq_desc *desc)
>>  {
>> +       unsigned long chipflags = irq_desc_get_chip(desc)->flags;
>> +
>>         if (!desc->action || irq_desc_is_chained(desc) ||
>>             desc->no_suspend_depth)
>>                 return false;
>>  
>>         if (irqd_is_wakeup_set(&desc->irq_data)) {
>>                 irqd_set(&desc->irq_data, IRQD_WAKEUP_ARMED);
>> +
>> +               if (chipflags & IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND)
>> +                       irq_enable(desc);
>
> Where is the corresponding change to resume_irq()? Don't we need to
> disable an irq if it was disabled on suspend and forcibly enabled here?

That part was below the POC code I provided in the fine print:

 "plus the counterpart in the resume path. This also ensures that state is
  consistent."

Who reads the fine print? :)
Maulik Shah Aug. 26, 2020, 9:52 a.m. UTC | #3
Hi,

On 8/26/2020 3:08 AM, Thomas Gleixner wrote:
> On Tue, Aug 25 2020 at 03:12, Stephen Boyd wrote:
>> Quoting Maulik Shah (2020-08-22 09:16:58)
>>> diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
>>> index c6c7e18..2cc800b 100644
>>> --- a/kernel/irq/pm.c
>>> +++ b/kernel/irq/pm.c
>>> @@ -69,12 +69,17 @@ void irq_pm_remove_action(struct irq_desc *desc, struct irqaction *action)
>>>   
>>>   static bool suspend_device_irq(struct irq_desc *desc)
>>>   {
>>> +       unsigned long chipflags = irq_desc_get_chip(desc)->flags;
>>> +
>>>          if (!desc->action || irq_desc_is_chained(desc) ||
>>>              desc->no_suspend_depth)
>>>                  return false;
>>>   
>>>          if (irqd_is_wakeup_set(&desc->irq_data)) {
>>>                  irqd_set(&desc->irq_data, IRQD_WAKEUP_ARMED);
>>> +
>>> +               if (chipflags & IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND)
>>> +                       irq_enable(desc);
>> Where is the corresponding change to resume_irq()? Don't we need to
>> disable an irq if it was disabled on suspend and forcibly enabled here?
I should have added comment explaining why i did not added.
I thought of having corresponding change to resume_irq() but i did not 
kept intentionally since i didn't
observe any issue in my testing.

Actually the drivers which called (disable_irq() + enable_irq_wake()), 
are invoking enable_irq()
in the resume path everytime. With the driver's call to enable_irq() 
things are restoring back already.

If above is not true in some corner case, then the IRQ handler of driver 
won't get invoked, in such case,
why even to wake up with such IRQs in the first place, right?

However If we don't want to rely on the drivers doing things correctly, 
state can be restored in resume_irq()
I explored this, During suspend,

1. Some IRQs are unmasked/enabled + marked for wakeup
2. Some IRQs are masked/disabled + marked for wakeup

So have to track and restore only IRQs in category (2).
With current patch we don't have way to track IRQ is in (1) or (2).
It may be done with the new IRQD flag saying like 
IRQD_IRQ_ENABLED_ON_SUSPEND

During suspend,
First check if the IRQ was in disabled/masked state to invoke 
irq_enable() only for category (2) and set the new flag.

     if (irqd_irq_disabled(&desc->irq_data) && (chipflags & 
IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND)) {
         irq_enable(desc);
         irq_state_set_enabled_on_suspend(desc); => this will set new 
IRQD_IRQ_ENABLED_ON_SUSPEND
     }

During resume,
Simply calling irq_disable(desc); don't work in resume_irq(), since by 
default this API tries to lazily disable at HW, which won't quite 
restore the state,
So instead of adding below

     if (irqd_irq_disabled(&desc->irq_data) && (chipflags & 
IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND)
     && (irqd_is_enabled_on_suspend(desc)))
     {
         irq_disable(desc);
         irq_state_clear_enabled_on_suspend(desc); => clear flag
     }

we can replicate the irq_disable() with removal of lazy part, something 
like,

     if (irqd_irq_disabled(&desc->irq_data) && (chipflags & 
IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND) &&
     (irqd_is_enabled_on_suspend(desc))) { ==> The new flag used to 
determine if IRQ was enabled during suspend path, then only restore.
         irq_state_set_disabled(desc);
         if (desc->irq_data.chip->irq_disable) {
desc->irq_data.chip->irq_disable(&desc->irq_data);
             irq_state_set_masked(desc);
         } else {
         mask_irq(desc);
     }
     irq_state_clear_enabled_on_suspend(desc);
}

which is matching exactly reverse of what is done in suspend entry.
Let me know if above is good i can include this in v6.

Thanks,
Maulik

> That part was below the POC code I provided in the fine print:
>
>   "plus the counterpart in the resume path. This also ensures that state is
>    consistent."
>
> Who reads the fine print? :)
Thomas Gleixner Aug. 26, 2020, 10:15 a.m. UTC | #4
On Wed, Aug 26 2020 at 15:22, Maulik Shah wrote:
> On 8/26/2020 3:08 AM, Thomas Gleixner wrote:
>>> Where is the corresponding change to resume_irq()? Don't we need to
>>> disable an irq if it was disabled on suspend and forcibly enabled here?
>>>
> I should have added comment explaining why i did not added.
> I thought of having corresponding change to resume_irq() but i did not 
> kept intentionally since i didn't
> observe any issue in my testing.

That makes it correct in which way? Did not explode in my face is hardly
proof of anything.

> Actually the drivers which called (disable_irq() + enable_irq_wake()), 
> are invoking enable_irq()
> in the resume path everytime. With the driver's call to enable_irq() 
> things are restoring back already.

No, that's just wrong because you again create inconsistent state.

> If above is not true in some corner case, then the IRQ handler of
> driver won't get invoked, in such case, why even to wake up with such
> IRQs in the first place, right?

I don't care about the corner case. If the driver misses to do it is
buggy in the first place. Silently papering over it is just mindless
hackery.

There are two reasonable choices here:

1) Do the symmetric thing

2) Let the drivers call a new function disable_wakeup_irq_for_suspend()
   which marks the interrupt to be enabled from the core on suspend and
   remove the enable call on the resume callback of the driver.

   Then you don't need the resume part in the core and state still is
   consistent.

I'm leaning towards #2 because that makes a lot of sense.

Thanks,

        tglx
Doug Anderson Aug. 31, 2020, 3:12 p.m. UTC | #5
Hi,


On Wed, Aug 26, 2020 at 3:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, Aug 26 2020 at 15:22, Maulik Shah wrote:
> > On 8/26/2020 3:08 AM, Thomas Gleixner wrote:
> >>> Where is the corresponding change to resume_irq()? Don't we need to
> >>> disable an irq if it was disabled on suspend and forcibly enabled here?
> >>>
> > I should have added comment explaining why i did not added.
> > I thought of having corresponding change to resume_irq() but i did not
> > kept intentionally since i didn't
> > observe any issue in my testing.
>
> That makes it correct in which way? Did not explode in my face is hardly
> proof of anything.
>
> > Actually the drivers which called (disable_irq() + enable_irq_wake()),
> > are invoking enable_irq()
> > in the resume path everytime. With the driver's call to enable_irq()
> > things are restoring back already.
>
> No, that's just wrong because you again create inconsistent state.
>
> > If above is not true in some corner case, then the IRQ handler of
> > driver won't get invoked, in such case, why even to wake up with such
> > IRQs in the first place, right?
>
> I don't care about the corner case. If the driver misses to do it is
> buggy in the first place. Silently papering over it is just mindless
> hackery.
>
> There are two reasonable choices here:
>
> 1) Do the symmetric thing
>
> 2) Let the drivers call a new function disable_wakeup_irq_for_suspend()
>    which marks the interrupt to be enabled from the core on suspend and
>    remove the enable call on the resume callback of the driver.
>
>    Then you don't need the resume part in the core and state still is
>    consistent.
>
> I'm leaning towards #2 because that makes a lot of sense.

IIUC, #2 requires that we change existing drivers that are currently
using disable_irq() + enable_irq_wake(), right?  Presumably, if we're
going to do #2, we should declare that what drivers used to do is now
considered illegal, right?  Perhaps we could detect that and throw a
warning so that they know that they need to change to use the new
disable_wakeup_irq_for_suspend() API.  Otherwise these drivers will
work fine on some systems (like they always have) but will fail in
weird corner cases for systems that are relying on drivers to call
disable_wakeup_irq_for_suspend().  That doesn't sound super great to
me...

...or, if doing the symmetric thing isn't too bad, we could do that?

-Doug
Thomas Gleixner Sept. 1, 2020, 9:51 a.m. UTC | #6
On Mon, Aug 31 2020 at 08:12, Doug Anderson wrote:
> On Wed, Aug 26, 2020 at 3:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> There are two reasonable choices here:
>>
>> 1) Do the symmetric thing
>>
>> 2) Let the drivers call a new function disable_wakeup_irq_for_suspend()
>>    which marks the interrupt to be enabled from the core on suspend and
>>    remove the enable call on the resume callback of the driver.
>>
>>    Then you don't need the resume part in the core and state still is
>>    consistent.
>>
>> I'm leaning towards #2 because that makes a lot of sense.
>
> IIUC, #2 requires that we change existing drivers that are currently
> using disable_irq() + enable_irq_wake(), right?  Presumably, if we're
> going to do #2, we should declare that what drivers used to do is now
> considered illegal, right?  Perhaps we could detect that and throw a
> warning so that they know that they need to change to use the new
> disable_wakeup_irq_for_suspend() API.  Otherwise these drivers will
> work fine on some systems (like they always have) but will fail in
> weird corner cases for systems that are relying on drivers to call
> disable_wakeup_irq_for_suspend().  That doesn't sound super great to
> me...

Hmm. With disable_irq() + enable_irq_wake() in the driver suspend path
the driver already makes an implicit assumption about the underlying irq
chip functionality, i.e. it expects that even with the interrupt
disabled the irq chip can wake up the system.

Now with the new flag magic and #1 we are just working around the driver
assumptions at the interrupt chip level.

That's inconsistent at best.

How many drivers are doing that sequence?  And the more important
question is why are they calling disable_irq() in the first place if
they want to be woken up by that interrupt.

The point is that the core suspend code disables all interrupts which
are not marked as wakeup enabled automatically and reenables them after
resume. So why would any driver invoke disable_irq() in the suspend
function at all? Historical raisins?

Thanks,

        tglx
Doug Anderson Sept. 2, 2020, 8:26 p.m. UTC | #7
Hi,

On Tue, Sep 1, 2020 at 2:51 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, Aug 31 2020 at 08:12, Doug Anderson wrote:
> > On Wed, Aug 26, 2020 at 3:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> There are two reasonable choices here:
> >>
> >> 1) Do the symmetric thing
> >>
> >> 2) Let the drivers call a new function disable_wakeup_irq_for_suspend()
> >>    which marks the interrupt to be enabled from the core on suspend and
> >>    remove the enable call on the resume callback of the driver.
> >>
> >>    Then you don't need the resume part in the core and state still is
> >>    consistent.
> >>
> >> I'm leaning towards #2 because that makes a lot of sense.
> >
> > IIUC, #2 requires that we change existing drivers that are currently
> > using disable_irq() + enable_irq_wake(), right?  Presumably, if we're
> > going to do #2, we should declare that what drivers used to do is now
> > considered illegal, right?  Perhaps we could detect that and throw a
> > warning so that they know that they need to change to use the new
> > disable_wakeup_irq_for_suspend() API.  Otherwise these drivers will
> > work fine on some systems (like they always have) but will fail in
> > weird corner cases for systems that are relying on drivers to call
> > disable_wakeup_irq_for_suspend().  That doesn't sound super great to
> > me...
>
> Hmm. With disable_irq() + enable_irq_wake() in the driver suspend path
> the driver already makes an implicit assumption about the underlying irq
> chip functionality, i.e. it expects that even with the interrupt
> disabled the irq chip can wake up the system.
>
> Now with the new flag magic and #1 we are just working around the driver
> assumptions at the interrupt chip level.
>
> That's inconsistent at best.

Sure, though I will say that it works on all Chromebooks we've shipped
over the last ~9 years since the main cros_ec (EC = embedded
controller) driver does this.  Of course, it's easy to just change
that driver.  I just don't want everything else breaking too.


> How many drivers are doing that sequence?

I remember looking this up before but can't find it.  It's gonna be
hard to get an exact count without fancier searching, but we should be
able to find a few...  I'll just do the simple:

git grep -C10 enable_irq_wake | grep -C10 'disable_irq('

That might miss people but it'll catch quite a few.  Ones that are
clearly using something like this:

drivers/input/keyboard/adp5588-keys.c
drivers/input/keyboard/adp5589-keys.c
drivers/input/mouse/elan_i2c_core.c
drivers/input/rmi4/rmi_driver.c
drivers/input/touchscreen/elants_i2c.c
drivers/input/touchscreen/raydium_i2c_ts.c
drivers/mfd/as3722.c
drivers/mfd/max14577.c (*)
drivers/mfd/max77693.c
drivers/mfd/max77843.c
drivers/mfd/sec-core.c (*)
drivers/mfd/twl6030-irq.c
drivers/platform/chrome/cros_ec.c
drivers/power/supply/max17042_battery.c
drivers/rtc/rtc-st-lpc.c

(*) Even has a comment explaining why!

Input is perhaps over-represented but presumably that's because input
is often the thing that wakes devices up.  ;-)


> And the more important
> question is why are they calling disable_irq() in the first place if
> they want to be woken up by that interrupt.

I tried to put my thoughts back in:

https://lore.kernel.org/r/CAD=FV=WN4R1tS47ZzdZa_hsbvLifwnv6rgETVaiea0+QSZmiOw@mail.gmail.com/

...but that was a long thread.  Copied the relevant bits here.
Basically a driver that calls disablre_irq() together with
enable_irq_wake() is trying to say:

* Don't call the interrupt handler for this interrupt until I call
enable_irq() but keep tracking it (either in hardware or in software).
Specifically it's a requirement that if the interrupt fires one or
more times while masked the interrupt handler should be called as soon
as enable_irq() is called.

* If this interrupt fires while the system is suspended then please
wake the system up.

Specifically I think it gets back to the idea that, from a device
driver's point of view, there isn't a separate concept of disabling an
IRQ (turn it off and stop tracking it) and masking an IRQ (keep track
of it but don't call my handler until I unmask).  As I understand it
drivers expect that the disable_irq() call is actually a mask and that
an IRQ is never fully disabled unless released by the driver.  It is a
little unfortunate (IMO) that the function is called disable_irq() but
as far as I understand that's historical.


> The point is that the core suspend code disables all interrupts which
> are not marked as wakeup enabled automatically and reenables them after
> resume. So why would any driver invoke disable_irq() in the suspend
> function at all? Historical raisins?

One case I can imagine: pretend that there are two power rails
controlling a device.  One power rail controls the communication
channel between the CPU and the peripheral and the other power rail
controls whether the peripheral is on.  At suspend time we want to
keep the peripheral on but we can shut down the power to the
communication channel.

One way you could do this is at suspend time:
  disable_irq()
  turn_off_comm_power()
  enable_irq_wake()

You'd do the disable_irq() (AKA mask your interrupt) because you'd
really want to make sure that your handler isn't called after you
turned off the communication power.  You want to leave the interrupt
pending/masked until you are able to turn the communications channel
back on and then you can query why the wakeup happened.

Now, admittedly, you could redesign the above driver to work any
number of different ways.  Maybe you could use the "noirq" suspend to
turn off your comm power or maybe you could come up with another
solution.  However, since the above has always worked and is quite
simple I guess that's what drivers use?


-Doug
Thomas Gleixner Sept. 3, 2020, 12:57 p.m. UTC | #8
On Wed, Sep 02 2020 at 13:26, Doug Anderson wrote:
> Specifically I think it gets back to the idea that, from a device
> driver's point of view, there isn't a separate concept of disabling an
> IRQ (turn it off and stop tracking it) and masking an IRQ (keep track
> of it but don't call my handler until I unmask).  As I understand it
> drivers expect that the disable_irq() call is actually a mask and that
> an IRQ is never fully disabled unless released by the driver.  It is a
> little unfortunate (IMO) that the function is called disable_irq() but
> as far as I understand that's historical.

Yes, the naming is historical but it always meant:

Don't invoke an interrupt handler. Whether that's achieved by actually
masking it at the interrupt chip level in hardware or by software state
in the core does not matter from the driver perspective.

>> The point is that the core suspend code disables all interrupts which
>> are not marked as wakeup enabled automatically and reenables them after
>> resume. So why would any driver invoke disable_irq() in the suspend
>> function at all? Historical raisins?
>
> One case I can imagine: pretend that there are two power rails
> controlling a device.  One power rail controls the communication
> channel between the CPU and the peripheral and the other power rail
> controls whether the peripheral is on.  At suspend time we want to
> keep the peripheral on but we can shut down the power to the
> communication channel.
>
> One way you could do this is at suspend time:
>   disable_irq()
>   turn_off_comm_power()
>   enable_irq_wake()
>
> You'd do the disable_irq() (AKA mask your interrupt) because you'd
> really want to make sure that your handler isn't called after you
> turned off the communication power.  You want to leave the interrupt
> pending/masked until you are able to turn the communications channel
> back on and then you can query why the wakeup happened.

Ok.

> Now, admittedly, you could redesign the above driver to work any
> number of different ways.  Maybe you could use the "noirq" suspend to
> turn off your comm power or maybe you could come up with another
> solution.  However, since the above has always worked and is quite
> simple I guess that's what drivers use?

That comm power case is a reasonable argument for having that
sequence. So we need to make sure that the underlying interrupt chips do
the right thing.

We have the following two cases:

1) irq chip does not have a irq_disable() callback and does not
   have IRQ_DISABLE_UNLAZY set

   In that case the interrupt is not masked at the hardware level. It's
   just software state. If the interrupt fires while disabled it is
   marked pending and actually masked at the hardware level.

   Actually there is a race condition which is not handled:

   disable_irq()
   ...
   
   interrupt fires
      mask and mark pending

   ....
   suspend_device_irq()
      if (wakeup source) {
         set_state(WAKEUP ARMED);
         return;
      }

   That pending interrupt will not prevent the machine from going into
   suspend and if it's an edge interrupt then an unmask in
   suspend_device_irq() won't help. Edge interrupts are not resent in
   hardware. They are fire and forget from the POV of the device
   hardware.

2) irq chip has a irq_disable() callback or has IRQ_DISABLE_UNLAZY set

   In that case disable_irq() will mask it at the hardware level and it
   stays that way until enable_irq() is invoked.

#1 kinda works and the gap is reasonably trivial to fix in
   suspend_device_irq() by checking the pending state and telling the PM
   core that there is a wakeup pending.

#2 Needs an indication from the chip flags that an interrupt which is
   masked has to be unmasked when it is a enabled wakeup source.

I assume your problem is #2, right? If it's #1 then UNMASK_IF_WAKEUP is
the wrong answer.

Thanks,

        tglx
Doug Anderson Sept. 3, 2020, 11:19 p.m. UTC | #9
Hi,

On Thu, Sep 3, 2020 at 5:57 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, Sep 02 2020 at 13:26, Doug Anderson wrote:
> > Specifically I think it gets back to the idea that, from a device
> > driver's point of view, there isn't a separate concept of disabling an
> > IRQ (turn it off and stop tracking it) and masking an IRQ (keep track
> > of it but don't call my handler until I unmask).  As I understand it
> > drivers expect that the disable_irq() call is actually a mask and that
> > an IRQ is never fully disabled unless released by the driver.  It is a
> > little unfortunate (IMO) that the function is called disable_irq() but
> > as far as I understand that's historical.
>
> Yes, the naming is historical but it always meant:
>
> Don't invoke an interrupt handler. Whether that's achieved by actually
> masking it at the interrupt chip level in hardware or by software state
> in the core does not matter from the driver perspective.
>
> >> The point is that the core suspend code disables all interrupts which
> >> are not marked as wakeup enabled automatically and reenables them after
> >> resume. So why would any driver invoke disable_irq() in the suspend
> >> function at all? Historical raisins?
> >
> > One case I can imagine: pretend that there are two power rails
> > controlling a device.  One power rail controls the communication
> > channel between the CPU and the peripheral and the other power rail
> > controls whether the peripheral is on.  At suspend time we want to
> > keep the peripheral on but we can shut down the power to the
> > communication channel.
> >
> > One way you could do this is at suspend time:
> >   disable_irq()
> >   turn_off_comm_power()
> >   enable_irq_wake()
> >
> > You'd do the disable_irq() (AKA mask your interrupt) because you'd
> > really want to make sure that your handler isn't called after you
> > turned off the communication power.  You want to leave the interrupt
> > pending/masked until you are able to turn the communications channel
> > back on and then you can query why the wakeup happened.
>
> Ok.
>
> > Now, admittedly, you could redesign the above driver to work any
> > number of different ways.  Maybe you could use the "noirq" suspend to
> > turn off your comm power or maybe you could come up with another
> > solution.  However, since the above has always worked and is quite
> > simple I guess that's what drivers use?
>
> That comm power case is a reasonable argument for having that
> sequence. So we need to make sure that the underlying interrupt chips do
> the right thing.
>
> We have the following two cases:
>
> 1) irq chip does not have a irq_disable() callback and does not
>    have IRQ_DISABLE_UNLAZY set
>
>    In that case the interrupt is not masked at the hardware level. It's
>    just software state. If the interrupt fires while disabled it is
>    marked pending and actually masked at the hardware level.
>
>    Actually there is a race condition which is not handled:
>
>    disable_irq()
>    ...
>
>    interrupt fires
>       mask and mark pending
>
>    ....
>    suspend_device_irq()
>       if (wakeup source) {
>          set_state(WAKEUP ARMED);
>          return;
>       }
>
>    That pending interrupt will not prevent the machine from going into
>    suspend and if it's an edge interrupt then an unmask in
>    suspend_device_irq() won't help. Edge interrupts are not resent in
>    hardware. They are fire and forget from the POV of the device
>    hardware.

Ah, interesting.  I didn't think about this case exactly.  I might
have a fix for it anyway.  At some point in time I was thinking that
the world could be solved by relying on lazily-disabled interrupts and
I wrote up a patch to make sure that they woke things up.  If you're
willing to check out our gerrit you can look at:

https://crrev.com/c/2314693

...if not I can post it as a RFC for you.  I'm sure I've solved the
problem in a completely incorrect and broken way, but hopefully the
idea makes sense.  In discussion we decided not to go this way because
it looked like IRQ clients could request an IRQ with
IRQ_DISABLE_UNLAZY and then that'd break us.  :(  ...but even so I
think the patch is roughly right and would address your point #1.


> 2) irq chip has a irq_disable() callback or has IRQ_DISABLE_UNLAZY set
>
>    In that case disable_irq() will mask it at the hardware level and it
>    stays that way until enable_irq() is invoked.
>
> #1 kinda works and the gap is reasonably trivial to fix in
>    suspend_device_irq() by checking the pending state and telling the PM
>    core that there is a wakeup pending.
>
> #2 Needs an indication from the chip flags that an interrupt which is
>    masked has to be unmasked when it is a enabled wakeup source.
>
> I assume your problem is #2, right? If it's #1 then UNMASK_IF_WAKEUP is
> the wrong answer.

Right, the problem is #2.  We're not in the lazy mode.

-Doug
Thomas Gleixner Sept. 4, 2020, 9:54 a.m. UTC | #10
Doug,

On Thu, Sep 03 2020 at 16:19, Doug Anderson wrote:
> On Thu, Sep 3, 2020 at 5:57 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>    That pending interrupt will not prevent the machine from going into
>>    suspend and if it's an edge interrupt then an unmask in
>>    suspend_device_irq() won't help. Edge interrupts are not resent in
>>    hardware. They are fire and forget from the POV of the device
>>    hardware.
>
> Ah, interesting.  I didn't think about this case exactly.  I might
> have a fix for it anyway.  At some point in time I was thinking that
> the world could be solved by relying on lazily-disabled interrupts and
> I wrote up a patch to make sure that they woke things up.  If you're
> willing to check out our gerrit you can look at:
>
> https://crrev.com/c/2314693
>
> ...if not I can post it as a RFC for you.

I actually tried despite my usual aversion against web
interfaces. Aversion confirmed :)

You could have included the 5 lines of patch into your reply to spare me
the experience. :)

> I'm sure I've solved the problem in a completely incorrect and broken
> way, but hopefully the idea makes sense.  In discussion we decided not
> to go this way because it looked like IRQ clients could request an IRQ
> with IRQ_DISABLE_UNLAZY and then that'd break us.  :( ...but even so I
> think the patch is roughly right and would address your point #1.

Kinda :) But that's still incomplete because it does not handle the case
where the interrupt arrives between disable_irq() and enable_irq_wake().
See below.

>> 2) irq chip has a irq_disable() callback or has IRQ_DISABLE_UNLAZY set
>>
>>    In that case disable_irq() will mask it at the hardware level and it
>>    stays that way until enable_irq() is invoked.
>>
>> #1 kinda works and the gap is reasonably trivial to fix in
>>    suspend_device_irq() by checking the pending state and telling the PM
>>    core that there is a wakeup pending.
>>
>> #2 Needs an indication from the chip flags that an interrupt which is
>>    masked has to be unmasked when it is a enabled wakeup source.
>>
>> I assume your problem is #2, right? If it's #1 then UNMASK_IF_WAKEUP is
>> the wrong answer.
>
> Right, the problem is #2.  We're not in the lazy mode.

Right and that's where we want the new chip flag with the unmask if
armed.

Thanks,

        tglx

8<------

 kernel/irq/pm.c |   27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -13,14 +13,19 @@
 
 #include "internals.h"
 
+static void irq_pm_do_wakeup(struct irq_desc *desc)
+{
+	irqd_clear(&desc->irq_data, IRQD_WAKEUP_ARMED);
+	desc->istate |= IRQS_SUSPENDED | IRQS_PENDING;
+	pm_system_irq_wakeup(irq_desc_get_irq(desc));
+}
+
 bool irq_pm_check_wakeup(struct irq_desc *desc)
 {
 	if (irqd_is_wakeup_armed(&desc->irq_data)) {
-		irqd_clear(&desc->irq_data, IRQD_WAKEUP_ARMED);
-		desc->istate |= IRQS_SUSPENDED | IRQS_PENDING;
 		desc->depth++;
 		irq_disable(desc);
-		pm_system_irq_wakeup(irq_desc_get_irq(desc));
+		irq_pm_do_wakeup(desc);
 		return true;
 	}
 	return false;
@@ -69,12 +74,24 @@ void irq_pm_remove_action(struct irq_des
 
 static bool suspend_device_irq(struct irq_desc *desc)
 {
+	struct irq_data *irqd = &desc->irq_data;
+
 	if (!desc->action || irq_desc_is_chained(desc) ||
 	    desc->no_suspend_depth)
 		return false;
 
-	if (irqd_is_wakeup_set(&desc->irq_data)) {
-		irqd_set(&desc->irq_data, IRQD_WAKEUP_ARMED);
+	if (irqd_is_wakeup_set(irqd)) {
+		irqd_set(irqd, IRQD_WAKEUP_ARMED);
+		/*
+		 * Interrupt might have been disabled in the suspend
+		 * sequence before the wakeup was enabled. If the interrupt
+		 * is lazy masked then it might have fired and the pending
+		 * bit is set. Ignoring this would miss the wakeup.
+		 */
+		if (irqd_irq_disabled(irqd) && desc->istate & IRQS_PENDING) {
+			irq_pm_do_wakeup(desc);
+			return false;
+		}
 		/*
 		 * We return true here to force the caller to issue
 		 * synchronize_irq(). We need to make sure that the
Doug Anderson Sept. 8, 2020, 7:05 p.m. UTC | #11
Hi,

On Fri, Sep 4, 2020 at 2:54 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Doug,
>
> On Thu, Sep 03 2020 at 16:19, Doug Anderson wrote:
> > On Thu, Sep 3, 2020 at 5:57 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >>    That pending interrupt will not prevent the machine from going into
> >>    suspend and if it's an edge interrupt then an unmask in
> >>    suspend_device_irq() won't help. Edge interrupts are not resent in
> >>    hardware. They are fire and forget from the POV of the device
> >>    hardware.
> >
> > Ah, interesting.  I didn't think about this case exactly.  I might
> > have a fix for it anyway.  At some point in time I was thinking that
> > the world could be solved by relying on lazily-disabled interrupts and
> > I wrote up a patch to make sure that they woke things up.  If you're
> > willing to check out our gerrit you can look at:
> >
> > https://crrev.com/c/2314693
> >
> > ...if not I can post it as a RFC for you.
>
> I actually tried despite my usual aversion against web
> interfaces. Aversion confirmed :)
>
> You could have included the 5 lines of patch into your reply to spare me
> the experience. :)

Sorry!  :(  Inline patches are a bit of a pain for me since I'm
certifiably insane and use the gmail web interface for kernel mailing
lists.  Everyone has their pet aversions, I guess.  ;-)


> > I'm sure I've solved the problem in a completely incorrect and broken
> > way, but hopefully the idea makes sense.  In discussion we decided not
> > to go this way because it looked like IRQ clients could request an IRQ
> > with IRQ_DISABLE_UNLAZY and then that'd break us.  :( ...but even so I
> > think the patch is roughly right and would address your point #1.
>
> Kinda :) But that's still incomplete because it does not handle the case
> where the interrupt arrives between disable_irq() and enable_irq_wake().
> See below.

Huh, I thought I'd handled this with the code in irq_set_irq_wake()
which checked if it was pending and did a wakeup.  In any case, I
trust your understanding of this code far better than I trust mine.
How should we proceed then?  Do you want to post up an official patch?

At the moment I don't have any test cases that need your patch since
the interrupts I'm dealing with are not lazily disabled.  However, I
still do agree that it's the right thing to do.


> >> 2) irq chip has a irq_disable() callback or has IRQ_DISABLE_UNLAZY set
> >>
> >>    In that case disable_irq() will mask it at the hardware level and it
> >>    stays that way until enable_irq() is invoked.
> >>
> >> #1 kinda works and the gap is reasonably trivial to fix in
> >>    suspend_device_irq() by checking the pending state and telling the PM
> >>    core that there is a wakeup pending.
> >>
> >> #2 Needs an indication from the chip flags that an interrupt which is
> >>    masked has to be unmasked when it is a enabled wakeup source.
> >>
> >> I assume your problem is #2, right? If it's #1 then UNMASK_IF_WAKEUP is
> >> the wrong answer.
> >
> > Right, the problem is #2.  We're not in the lazy mode.
>
> Right and that's where we want the new chip flag with the unmask if
> armed.

OK, so we're back in Maulik's court to spin, right?  I think the last
word before our tangent was at:

http://lore.kernel.org/r/87y2m1vhkm.fsf@nanos.tec.linutronix.de

There you were leaning towards #2 ("a new function
disable_wakeup_irq_for_suspend()").  Presumably you'd now be
suggesting #1 ("Do the symmetric thing") since I've pointed out the
bunch of drivers that would need to change.


-Doug
Thomas Gleixner Sept. 10, 2020, 8:51 a.m. UTC | #12
On Tue, Sep 08 2020 at 12:05, Doug Anderson wrote:
> On Fri, Sep 4, 2020 at 2:54 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> Right and that's where we want the new chip flag with the unmask if
>> armed.
>
> OK, so we're back in Maulik's court to spin, right?  I think the last
> word before our tangent was at:
>
> http://lore.kernel.org/r/87y2m1vhkm.fsf@nanos.tec.linutronix.de
>
> There you were leaning towards #2 ("a new function
> disable_wakeup_irq_for_suspend()").  Presumably you'd now be
> suggesting #1 ("Do the symmetric thing") since I've pointed out the
> bunch of drivers that would need to change.

Yes #1 is what we need.

Thanks,

        tglx
diff mbox series

Patch

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 1b7f4df..752eb9a 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -545,27 +545,30 @@  struct irq_chip {
 /*
  * irq_chip specific flags
  *
- * IRQCHIP_SET_TYPE_MASKED:	Mask before calling chip.irq_set_type()
- * IRQCHIP_EOI_IF_HANDLED:	Only issue irq_eoi() when irq was handled
- * IRQCHIP_MASK_ON_SUSPEND:	Mask non wake irqs in the suspend path
- * IRQCHIP_ONOFFLINE_ENABLED:	Only call irq_on/off_line callbacks
- *				when irq enabled
- * IRQCHIP_SKIP_SET_WAKE:	Skip chip.irq_set_wake(), for this irq chip
- * IRQCHIP_ONESHOT_SAFE:	One shot does not require mask/unmask
- * IRQCHIP_EOI_THREADED:	Chip requires eoi() on unmask in threaded mode
- * IRQCHIP_SUPPORTS_LEVEL_MSI	Chip can provide two doorbells for Level MSIs
- * IRQCHIP_SUPPORTS_NMI:	Chip can deliver NMIs, only for root irqchips
+ * IRQCHIP_SET_TYPE_MASKED:           Mask before calling chip.irq_set_type()
+ * IRQCHIP_EOI_IF_HANDLED:            Only issue irq_eoi() when irq was handled
+ * IRQCHIP_MASK_ON_SUSPEND:           Mask non wake irqs in the suspend path
+ * IRQCHIP_ONOFFLINE_ENABLED:         Only call irq_on/off_line callbacks
+ *                                    when irq enabled
+ * IRQCHIP_SKIP_SET_WAKE:             Skip chip.irq_set_wake(), for this irq chip
+ * IRQCHIP_ONESHOT_SAFE:              One shot does not require mask/unmask
+ * IRQCHIP_EOI_THREADED:              Chip requires eoi() on unmask in threaded mode
+ * IRQCHIP_SUPPORTS_LEVEL_MSI:        Chip can provide two doorbells for Level MSIs
+ * IRQCHIP_SUPPORTS_NMI:              Chip can deliver NMIs, only for root irqchips
+ * IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND:  Invoke .irq_enable/.irq_unmask for wake irqs
+ *                                    in the suspend path
  */
 enum {
-	IRQCHIP_SET_TYPE_MASKED		= (1 <<  0),
-	IRQCHIP_EOI_IF_HANDLED		= (1 <<  1),
-	IRQCHIP_MASK_ON_SUSPEND		= (1 <<  2),
-	IRQCHIP_ONOFFLINE_ENABLED	= (1 <<  3),
-	IRQCHIP_SKIP_SET_WAKE		= (1 <<  4),
-	IRQCHIP_ONESHOT_SAFE		= (1 <<  5),
-	IRQCHIP_EOI_THREADED		= (1 <<  6),
-	IRQCHIP_SUPPORTS_LEVEL_MSI	= (1 <<  7),
-	IRQCHIP_SUPPORTS_NMI		= (1 <<  8),
+	IRQCHIP_SET_TYPE_MASKED			= (1 <<  0),
+	IRQCHIP_EOI_IF_HANDLED			= (1 <<  1),
+	IRQCHIP_MASK_ON_SUSPEND			= (1 <<  2),
+	IRQCHIP_ONOFFLINE_ENABLED		= (1 <<  3),
+	IRQCHIP_SKIP_SET_WAKE			= (1 <<  4),
+	IRQCHIP_ONESHOT_SAFE			= (1 <<  5),
+	IRQCHIP_EOI_THREADED			= (1 <<  6),
+	IRQCHIP_SUPPORTS_LEVEL_MSI		= (1 <<  7),
+	IRQCHIP_SUPPORTS_NMI			= (1 <<  8),
+	IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND	= (1 <<  9),
 };
 
 #include <linux/irqdesc.h>
diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c
index b95ff5d..ab4f637 100644
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -57,6 +57,7 @@  static const struct irq_bit_descr irqchip_flags[] = {
 	BIT_MASK_DESCR(IRQCHIP_EOI_THREADED),
 	BIT_MASK_DESCR(IRQCHIP_SUPPORTS_LEVEL_MSI),
 	BIT_MASK_DESCR(IRQCHIP_SUPPORTS_NMI),
+	BIT_MASK_DESCR(IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND),
 };
 
 static void
diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
index c6c7e18..2cc800b 100644
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -69,12 +69,17 @@  void irq_pm_remove_action(struct irq_desc *desc, struct irqaction *action)
 
 static bool suspend_device_irq(struct irq_desc *desc)
 {
+	unsigned long chipflags = irq_desc_get_chip(desc)->flags;
+
 	if (!desc->action || irq_desc_is_chained(desc) ||
 	    desc->no_suspend_depth)
 		return false;
 
 	if (irqd_is_wakeup_set(&desc->irq_data)) {
 		irqd_set(&desc->irq_data, IRQD_WAKEUP_ARMED);
+
+		if (chipflags & IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND)
+			irq_enable(desc);
 		/*
 		 * We return true here to force the caller to issue
 		 * synchronize_irq(). We need to make sure that the
@@ -93,7 +98,7 @@  static bool suspend_device_irq(struct irq_desc *desc)
 	 * chip level. The chip implementation indicates that with
 	 * IRQCHIP_MASK_ON_SUSPEND.
 	 */
-	if (irq_desc_get_chip(desc)->flags & IRQCHIP_MASK_ON_SUSPEND)
+	if (chipflags & IRQCHIP_MASK_ON_SUSPEND)
 		mask_irq(desc);
 	return true;
 }