[1/9] i2c: designware: Fix system suspend

Message ID 1498072888-14782-2-git-send-email-ulf.hansson@linaro.org
State Superseded
Headers show

Commit Message

Ulf Hansson June 21, 2017, 7:21 p.m.
The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
during system suspend"), may suggest to the PM core to try out the so
called direct_complete path for system sleep. In this path, the PM core
treats a runtime suspended device as it's already in a proper low power
state for system sleep, which makes it skip calling the system sleep
callbacks for the device, except for the ->prepare() and the ->complete()
callback.

Moreover, under certain circumstances the PM core may unset the
direct_complete flag for a parent device, in case its child device are
being system suspended before. In other words, the PM core doesn't skip
calling the system sleep callbacks, no matter if the device is runtime
suspended or not.

In cases of an i2c slave device, the above situation is triggered.
Unfortunate, this also breaks the assumption that the i2c device is always
runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
invoked, which then leads to a regression.

More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
clk_core_unprepare(), for an already disabled/unprepared clock, leading to
complaints about clocks calls being wrongly balanced.

In cases when the i2c device is attached to the ACPI PM domain, the problem
doesn't occur. That's because ACPI's ->suspend() callback, assigned to
acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.

To make a simple fix for this regression, let's runtime resume the i2c
device in the ->prepare() callback, assigned to dw_i2c_plat_prepare(). This
prevents the direct_complete path from being executed by the PM core and
guarantees the dw_i2c_plat_suspend() is called with the i2c device always
being runtime resumed.

Of course this change is suboptimal, because to always force a runtime
resume of the i2c device in ->prepare() is a waste, especially in those
cases when it could have remained runtime suspended during the entire
system sleep sequence. However, to accomplish that behaviour a bigger
change is needed, so defer that to future changes not applicable as fixes
or for stable.

Fixes: 8503ff166504 ("i2c: designware: Avoid unnecessary resuming...")
Cc: stable@vger.linux.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 drivers/i2c/busses/i2c-designware-platdrv.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

Comments

Rafael J. Wysocki June 21, 2017, 11:31 p.m. | #1
On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
> during system suspend"), may suggest to the PM core to try out the so
> called direct_complete path for system sleep. In this path, the PM core
> treats a runtime suspended device as it's already in a proper low power
> state for system sleep, which makes it skip calling the system sleep
> callbacks for the device, except for the ->prepare() and the ->complete()
> callback.
>
> Moreover, under certain circumstances the PM core may unset the
> direct_complete flag for a parent device, in case its child device are
> being system suspended before. In other words, the PM core doesn't skip
> calling the system sleep callbacks, no matter if the device is runtime
> suspended or not.
>
> In cases of an i2c slave device, the above situation is triggered.
> Unfortunate, this also breaks the assumption that the i2c device is always
> runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
> invoked, which then leads to a regression.
>
> More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
> clk_core_unprepare(), for an already disabled/unprepared clock, leading to
> complaints about clocks calls being wrongly balanced.
>
> In cases when the i2c device is attached to the ACPI PM domain, the problem
> doesn't occur. That's because ACPI's ->suspend() callback, assigned to
> acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.

Which really is expected to happen, so direct_complete should only be
used along with the ACPI PM domain in this case.

Moreover, in the ACPI PM domain case acpi_subsys_prepare() is supposed
to do the right thing without dw_i2c_plat_prepare() and the return
value of the latter will be ignored anyway, so dw_i2c_plat_prepare()
will only have effect without ACPI PM domain AFAICS.

It looks like commit 8503ff166504 is entirely misguided.

Thanks,
Rafael
Mika Westerberg June 22, 2017, 10:49 a.m. | #2
On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
> On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
> > during system suspend"), may suggest to the PM core to try out the so
> > called direct_complete path for system sleep. In this path, the PM core
> > treats a runtime suspended device as it's already in a proper low power
> > state for system sleep, which makes it skip calling the system sleep
> > callbacks for the device, except for the ->prepare() and the ->complete()
> > callback.
> >
> > Moreover, under certain circumstances the PM core may unset the
> > direct_complete flag for a parent device, in case its child device are
> > being system suspended before. In other words, the PM core doesn't skip
> > calling the system sleep callbacks, no matter if the device is runtime
> > suspended or not.
> >
> > In cases of an i2c slave device, the above situation is triggered.
> > Unfortunate, this also breaks the assumption that the i2c device is always
> > runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
> > invoked, which then leads to a regression.
> >
> > More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
> > clk_core_unprepare(), for an already disabled/unprepared clock, leading to
> > complaints about clocks calls being wrongly balanced.
> >
> > In cases when the i2c device is attached to the ACPI PM domain, the problem
> > doesn't occur. That's because ACPI's ->suspend() callback, assigned to
> > acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
> 
> Which really is expected to happen, so direct_complete should only be
> used along with the ACPI PM domain in this case.
> 
> Moreover, in the ACPI PM domain case acpi_subsys_prepare() is supposed
> to do the right thing without dw_i2c_plat_prepare() and the return
> value of the latter will be ignored anyway, so dw_i2c_plat_prepare()
> will only have effect without ACPI PM domain AFAICS.
> 
> It looks like commit 8503ff166504 is entirely misguided.

Indeed it is. At the time I suggested that change I did not really
understand how the direct complete is supposed to be used :-/

Thanks Ulf for taking care of this!

I tested this series on Dell XPS 9350 which has touch panel connected to
I2C and suspend/resume still works fine and I can see the controller
going to D3 when the touch panel is idle.

I can perform more comprehensive testing next week.
Jarkko Nikula June 22, 2017, 11:16 a.m. | #3
On 06/22/2017 01:49 PM, Mika Westerberg wrote:
> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
>> On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> Thanks Ulf for taking care of this!
>
Indeed!

> I tested this series on Dell XPS 9350 which has touch panel connected to
> I2C and suspend/resume still works fine and I can see the controller
> going to D3 when the touch panel is idle.
>
> I can perform more comprehensive testing next week.
>
Unfortunately I'm seeing interrupt storm during suspend/resume on 
platform using PM domain from drivers/acpi/acpi_lpss.c straight after 
this patch. Maybe some timing related as I see it only if I have debug 
messages on (i2c_designware_core.dyndbg=+p). But it occurs only after 
this patch.

Have to dig more deeply next week.
Rafael J. Wysocki June 22, 2017, 2:41 p.m. | #4
On Thursday, June 22, 2017 01:49:33 PM Mika Westerberg wrote:
> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
> > On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
> > > during system suspend"), may suggest to the PM core to try out the so
> > > called direct_complete path for system sleep. In this path, the PM core
> > > treats a runtime suspended device as it's already in a proper low power
> > > state for system sleep, which makes it skip calling the system sleep
> > > callbacks for the device, except for the ->prepare() and the ->complete()
> > > callback.
> > >
> > > Moreover, under certain circumstances the PM core may unset the
> > > direct_complete flag for a parent device, in case its child device are
> > > being system suspended before. In other words, the PM core doesn't skip
> > > calling the system sleep callbacks, no matter if the device is runtime
> > > suspended or not.
> > >
> > > In cases of an i2c slave device, the above situation is triggered.
> > > Unfortunate, this also breaks the assumption that the i2c device is always
> > > runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
> > > invoked, which then leads to a regression.
> > >
> > > More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
> > > clk_core_unprepare(), for an already disabled/unprepared clock, leading to
> > > complaints about clocks calls being wrongly balanced.
> > >
> > > In cases when the i2c device is attached to the ACPI PM domain, the problem
> > > doesn't occur. That's because ACPI's ->suspend() callback, assigned to
> > > acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
> > 
> > Which really is expected to happen, so direct_complete should only be
> > used along with the ACPI PM domain in this case.
> > 
> > Moreover, in the ACPI PM domain case acpi_subsys_prepare() is supposed
> > to do the right thing without dw_i2c_plat_prepare() and the return
> > value of the latter will be ignored anyway, so dw_i2c_plat_prepare()
> > will only have effect without ACPI PM domain AFAICS.
> > 
> > It looks like commit 8503ff166504 is entirely misguided.
> 
> Indeed it is. At the time I suggested that change I did not really
> understand how the direct complete is supposed to be used :-/

So can we go for a full revert, please, and then fix up things properly?

Thanks,
Rafael
Ulf Hansson June 22, 2017, 9:37 p.m. | #5
On 22 June 2017 at 16:41, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Thursday, June 22, 2017 01:49:33 PM Mika Westerberg wrote:
>> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
>> > On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> > > The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
>> > > during system suspend"), may suggest to the PM core to try out the so
>> > > called direct_complete path for system sleep. In this path, the PM core
>> > > treats a runtime suspended device as it's already in a proper low power
>> > > state for system sleep, which makes it skip calling the system sleep
>> > > callbacks for the device, except for the ->prepare() and the ->complete()
>> > > callback.
>> > >
>> > > Moreover, under certain circumstances the PM core may unset the
>> > > direct_complete flag for a parent device, in case its child device are
>> > > being system suspended before. In other words, the PM core doesn't skip
>> > > calling the system sleep callbacks, no matter if the device is runtime
>> > > suspended or not.
>> > >
>> > > In cases of an i2c slave device, the above situation is triggered.
>> > > Unfortunate, this also breaks the assumption that the i2c device is always
>> > > runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
>> > > invoked, which then leads to a regression.
>> > >
>> > > More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
>> > > clk_core_unprepare(), for an already disabled/unprepared clock, leading to
>> > > complaints about clocks calls being wrongly balanced.
>> > >
>> > > In cases when the i2c device is attached to the ACPI PM domain, the problem
>> > > doesn't occur. That's because ACPI's ->suspend() callback, assigned to
>> > > acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
>> >
>> > Which really is expected to happen, so direct_complete should only be
>> > used along with the ACPI PM domain in this case.
>> >
>> > Moreover, in the ACPI PM domain case acpi_subsys_prepare() is supposed
>> > to do the right thing without dw_i2c_plat_prepare() and the return
>> > value of the latter will be ignored anyway, so dw_i2c_plat_prepare()
>> > will only have effect without ACPI PM domain AFAICS.
>> >
>> > It looks like commit 8503ff166504 is entirely misguided.
>>
>> Indeed it is. At the time I suggested that change I did not really
>> understand how the direct complete is supposed to be used :-/
>
> So can we go for a full revert, please, and then fix up things properly?

Unfortunate I think a revert is going to make it worse, for the non ACPI case.

Because in that case, unless I am reading the code wrong, I think when
the device is runtime suspended during system sleep, then the system
suspend callback will also be called (because the direct_complete
isn't run), again causing clock unbalance issues.

This makes it worse in that sense, that then you don't even need an
i2c slave to trigger the problem.

Kind regards
Uffe
Rafael J. Wysocki June 22, 2017, 10:01 p.m. | #6
On Thu, Jun 22, 2017 at 11:37 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 22 June 2017 at 16:41, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> On Thursday, June 22, 2017 01:49:33 PM Mika Westerberg wrote:
>>> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
>>> > On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>> > > The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
>>> > > during system suspend"), may suggest to the PM core to try out the so
>>> > > called direct_complete path for system sleep. In this path, the PM core
>>> > > treats a runtime suspended device as it's already in a proper low power
>>> > > state for system sleep, which makes it skip calling the system sleep
>>> > > callbacks for the device, except for the ->prepare() and the ->complete()
>>> > > callback.
>>> > >
>>> > > Moreover, under certain circumstances the PM core may unset the
>>> > > direct_complete flag for a parent device, in case its child device are
>>> > > being system suspended before. In other words, the PM core doesn't skip
>>> > > calling the system sleep callbacks, no matter if the device is runtime
>>> > > suspended or not.
>>> > >
>>> > > In cases of an i2c slave device, the above situation is triggered.
>>> > > Unfortunate, this also breaks the assumption that the i2c device is always
>>> > > runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
>>> > > invoked, which then leads to a regression.
>>> > >
>>> > > More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
>>> > > clk_core_unprepare(), for an already disabled/unprepared clock, leading to
>>> > > complaints about clocks calls being wrongly balanced.
>>> > >
>>> > > In cases when the i2c device is attached to the ACPI PM domain, the problem
>>> > > doesn't occur. That's because ACPI's ->suspend() callback, assigned to
>>> > > acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
>>> >
>>> > Which really is expected to happen, so direct_complete should only be
>>> > used along with the ACPI PM domain in this case.
>>> >
>>> > Moreover, in the ACPI PM domain case acpi_subsys_prepare() is supposed
>>> > to do the right thing without dw_i2c_plat_prepare() and the return
>>> > value of the latter will be ignored anyway, so dw_i2c_plat_prepare()
>>> > will only have effect without ACPI PM domain AFAICS.
>>> >
>>> > It looks like commit 8503ff166504 is entirely misguided.
>>>
>>> Indeed it is. At the time I suggested that change I did not really
>>> understand how the direct complete is supposed to be used :-/
>>
>> So can we go for a full revert, please, and then fix up things properly?
>
> Unfortunate I think a revert is going to make it worse, for the non ACPI case.
>
> Because in that case, unless I am reading the code wrong, I think when
> the device is runtime suspended during system sleep, then the system
> suspend callback will also be called (because the direct_complete
> isn't run), again causing clock unbalance issues.
>
> This makes it worse in that sense, that then you don't even need an
> i2c slave to trigger the problem.

In that case your changelog is a bit misleading.

It looks like the commit in question attempted to fix exactly this
issue, but it failed, so it should be replaced with something else
which is what your patch is effectively doing.

IMO you should describe the original problem, explain why that commit
is not sufficient to fix it and then describe the final fix.

Anyway, after reading the changelog it should be clear that things
were broken before the commit in question.

And BTW I'm not really sure how the rest of the series is related to this?

Thanks,
Rafael
Ulf Hansson June 26, 2017, 4:49 p.m. | #7
On 23 June 2017 at 00:01, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Thu, Jun 22, 2017 at 11:37 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On 22 June 2017 at 16:41, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>> On Thursday, June 22, 2017 01:49:33 PM Mika Westerberg wrote:
>>>> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
>>>> > On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>> > > The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
>>>> > > during system suspend"), may suggest to the PM core to try out the so
>>>> > > called direct_complete path for system sleep. In this path, the PM core
>>>> > > treats a runtime suspended device as it's already in a proper low power
>>>> > > state for system sleep, which makes it skip calling the system sleep
>>>> > > callbacks for the device, except for the ->prepare() and the ->complete()
>>>> > > callback.
>>>> > >
>>>> > > Moreover, under certain circumstances the PM core may unset the
>>>> > > direct_complete flag for a parent device, in case its child device are
>>>> > > being system suspended before. In other words, the PM core doesn't skip
>>>> > > calling the system sleep callbacks, no matter if the device is runtime
>>>> > > suspended or not.
>>>> > >
>>>> > > In cases of an i2c slave device, the above situation is triggered.
>>>> > > Unfortunate, this also breaks the assumption that the i2c device is always
>>>> > > runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
>>>> > > invoked, which then leads to a regression.
>>>> > >
>>>> > > More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
>>>> > > clk_core_unprepare(), for an already disabled/unprepared clock, leading to
>>>> > > complaints about clocks calls being wrongly balanced.
>>>> > >
>>>> > > In cases when the i2c device is attached to the ACPI PM domain, the problem
>>>> > > doesn't occur. That's because ACPI's ->suspend() callback, assigned to
>>>> > > acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
>>>> >
>>>> > Which really is expected to happen, so direct_complete should only be
>>>> > used along with the ACPI PM domain in this case.
>>>> >
>>>> > Moreover, in the ACPI PM domain case acpi_subsys_prepare() is supposed
>>>> > to do the right thing without dw_i2c_plat_prepare() and the return
>>>> > value of the latter will be ignored anyway, so dw_i2c_plat_prepare()
>>>> > will only have effect without ACPI PM domain AFAICS.
>>>> >
>>>> > It looks like commit 8503ff166504 is entirely misguided.
>>>>
>>>> Indeed it is. At the time I suggested that change I did not really
>>>> understand how the direct complete is supposed to be used :-/
>>>
>>> So can we go for a full revert, please, and then fix up things properly?
>>
>> Unfortunate I think a revert is going to make it worse, for the non ACPI case.
>>
>> Because in that case, unless I am reading the code wrong, I think when
>> the device is runtime suspended during system sleep, then the system
>> suspend callback will also be called (because the direct_complete
>> isn't run), again causing clock unbalance issues.
>>
>> This makes it worse in that sense, that then you don't even need an
>> i2c slave to trigger the problem.
>
> In that case your changelog is a bit misleading.

Yeah, it is! I didn't realize that until I fully investigated the revert option.

>
> It looks like the commit in question attempted to fix exactly this
> issue, but it failed, so it should be replaced with something else
> which is what your patch is effectively doing.
>
> IMO you should describe the original problem, explain why that commit
> is not sufficient to fix it and then describe the final fix.
>
> Anyway, after reading the changelog it should be clear that things
> were broken before the commit in question.
>
> And BTW I'm not really sure how the rest of the series is related to this?

Going back in history, I realize the system sleep support in this
driver has been broken even before the commit $subject patch intends
to fix.
However it has been working fine for the ACPI case, because of how the
ACPI PM domain manages its devices during system sleep.

The commit in question, adds an improvement to the driver, because it
enables the direct_complete path. For ACPI, that was already working,
but not for the other cases. So to be able to support the similar
improvement as the direct_complete path offers, as that isn't working
for this driver, I tried out using the runtime PM centric path
instead. That is what the rest of the changes in this series takes
care of.

Now, as the system sleep support is broken I wanted to make a simple
fix for that first, via $subject patch. I guess what makes this a bit
confusing is that I shouldn't point to a certain commit, but rather
just add a stable tag and update the changelog accordingly.

Kind regards
Uffe
Grygorii Strashko June 26, 2017, 7:39 p.m. | #8
On 06/26/2017 11:49 AM, Ulf Hansson wrote:
> On 23 June 2017 at 00:01, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> On Thu, Jun 22, 2017 at 11:37 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>> On 22 June 2017 at 16:41, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>> On Thursday, June 22, 2017 01:49:33 PM Mika Westerberg wrote:
>>>>> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
>>>>>> On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>>>>> The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
>>>>>>> during system suspend"), may suggest to the PM core to try out the so
>>>>>>> called direct_complete path for system sleep. In this path, the PM core
>>>>>>> treats a runtime suspended device as it's already in a proper low power
>>>>>>> state for system sleep, which makes it skip calling the system sleep
>>>>>>> callbacks for the device, except for the ->prepare() and the ->complete()
>>>>>>> callback.
>>>>>>>
>>>>>>> Moreover, under certain circumstances the PM core may unset the
>>>>>>> direct_complete flag for a parent device, in case its child device are
>>>>>>> being system suspended before. In other words, the PM core doesn't skip
>>>>>>> calling the system sleep callbacks, no matter if the device is runtime
>>>>>>> suspended or not.
>>>>>>>
>>>>>>> In cases of an i2c slave device, the above situation is triggered.
>>>>>>> Unfortunate, this also breaks the assumption that the i2c device is always
>>>>>>> runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
>>>>>>> invoked, which then leads to a regression.
>>>>>>>
>>>>>>> More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
>>>>>>> clk_core_unprepare(), for an already disabled/unprepared clock, leading to
>>>>>>> complaints about clocks calls being wrongly balanced.
>>>>>>>
>>>>>>> In cases when the i2c device is attached to the ACPI PM domain, the problem
>>>>>>> doesn't occur. That's because ACPI's ->suspend() callback, assigned to
>>>>>>> acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
>>>>>>
>>>>>> Which really is expected to happen, so direct_complete should only be
>>>>>> used along with the ACPI PM domain in this case.
>>>>>>
>>>>>> Moreover, in the ACPI PM domain case acpi_subsys_prepare() is supposed
>>>>>> to do the right thing without dw_i2c_plat_prepare() and the return
>>>>>> value of the latter will be ignored anyway, so dw_i2c_plat_prepare()
>>>>>> will only have effect without ACPI PM domain AFAICS.
>>>>>>
>>>>>> It looks like commit 8503ff166504 is entirely misguided.
>>>>>
>>>>> Indeed it is. At the time I suggested that change I did not really
>>>>> understand how the direct complete is supposed to be used :-/
>>>>
>>>> So can we go for a full revert, please, and then fix up things properly?
>>>
>>> Unfortunate I think a revert is going to make it worse, for the non ACPI case.
>>>
>>> Because in that case, unless I am reading the code wrong, I think when
>>> the device is runtime suspended during system sleep, then the system
>>> suspend callback will also be called (because the direct_complete
>>> isn't run), again causing clock unbalance issues.
>>>
>>> This makes it worse in that sense, that then you don't even need an
>>> i2c slave to trigger the problem.
>>
>> In that case your changelog is a bit misleading.
> 
> Yeah, it is! I didn't realize that until I fully investigated the revert option.
> 
>>
>> It looks like the commit in question attempted to fix exactly this
>> issue, but it failed, so it should be replaced with something else
>> which is what your patch is effectively doing.
>>
>> IMO you should describe the original problem, explain why that commit
>> is not sufficient to fix it and then describe the final fix.
>>
>> Anyway, after reading the changelog it should be clear that things
>> were broken before the commit in question.
>>
>> And BTW I'm not really sure how the rest of the series is related to this?
> 
> Going back in history, I realize the system sleep support in this
> driver has been broken even before the commit $subject patch intends
> to fix.
> However it has been working fine for the ACPI case, because of how the
> ACPI PM domain manages its devices during system sleep.
> 
> The commit in question, adds an improvement to the driver, because it
> enables the direct_complete path. For ACPI, that was already working,
> but not for the other cases. So to be able to support the similar
> improvement as the direct_complete path offers, as that isn't working
> for this driver, I tried out using the runtime PM centric path
> instead. That is what the rest of the changes in this series takes
> care of.
> 
> Now, as the system sleep support is broken I wanted to make a simple
> fix for that first, via $subject patch. I guess what makes this a bit
> confusing is that I shouldn't point to a certain commit, but rather
> just add a stable tag and update the changelog accordingly.
> 

Wouldn't it fix suspend for this driver if you will just replace 
dw_i2c_plat_suspend() with pm_runtime_force_suspend() in SET_SYSTEM_SLEEP_PM_OPS() as
you've done in patch 9?

And, I think, direct_complete path should still work after this also.
Rafael J. Wysocki June 26, 2017, 9:11 p.m. | #9
On Mon, Jun 26, 2017 at 9:39 PM, Grygorii Strashko
<grygorii.strashko@ti.com> wrote:
> On 06/26/2017 11:49 AM, Ulf Hansson wrote:

[cut]

>>>> This makes it worse in that sense, that then you don't even need an
>>>> i2c slave to trigger the problem.
>>>
>>> In that case your changelog is a bit misleading.
>>
>> Yeah, it is! I didn't realize that until I fully investigated the revert option.
>>
>>>
>>> It looks like the commit in question attempted to fix exactly this
>>> issue, but it failed, so it should be replaced with something else
>>> which is what your patch is effectively doing.
>>>
>>> IMO you should describe the original problem, explain why that commit
>>> is not sufficient to fix it and then describe the final fix.
>>>
>>> Anyway, after reading the changelog it should be clear that things
>>> were broken before the commit in question.
>>>
>>> And BTW I'm not really sure how the rest of the series is related to this?
>>
>> Going back in history, I realize the system sleep support in this
>> driver has been broken even before the commit $subject patch intends
>> to fix.
>> However it has been working fine for the ACPI case, because of how the
>> ACPI PM domain manages its devices during system sleep.
>>
>> The commit in question, adds an improvement to the driver, because it
>> enables the direct_complete path. For ACPI, that was already working,
>> but not for the other cases. So to be able to support the similar
>> improvement as the direct_complete path offers, as that isn't working
>> for this driver, I tried out using the runtime PM centric path
>> instead. That is what the rest of the changes in this series takes
>> care of.
>>
>> Now, as the system sleep support is broken I wanted to make a simple
>> fix for that first, via $subject patch. I guess what makes this a bit
>> confusing is that I shouldn't point to a certain commit, but rather
>> just add a stable tag and update the changelog accordingly.

I agree, modulo the below.

>
> Wouldn't it fix suspend for this driver if you will just replace
> dw_i2c_plat_suspend() with pm_runtime_force_suspend() in SET_SYSTEM_SLEEP_PM_OPS() as
> you've done in patch 9?
>
> And, I think, direct_complete path should still work after this also.

That's a good point and I was about to mention that.

In any case, even if the pm_runtime_resume() added by the $subject
patch is necessary to start with, it could be added to the ->suspend
callback of the driver instead of the ->complete one, in which case
the ACPI path would not be affected by this change.

Thanks,
Rafael
Jarkko Nikula June 27, 2017, 7:33 a.m. | #10
On 06/27/2017 12:11 AM, Rafael J. Wysocki wrote:
> On Mon, Jun 26, 2017 at 9:39 PM, Grygorii Strashko
>> Wouldn't it fix suspend for this driver if you will just replace
>> dw_i2c_plat_suspend() with pm_runtime_force_suspend() in SET_SYSTEM_SLEEP_PM_OPS() as
>> you've done in patch 9?
>>
>> And, I think, direct_complete path should still work after this also.
>
> That's a good point and I was about to mention that.
>
Unfortunately it will cause regression to platforms using 
drivers/acpi/acpi_lpss.c power domain:

http://www.spinics.net/lists/linux-i2c/msg29079.html
Jarkko Nikula June 27, 2017, 7:55 a.m. | #11
On 06/22/2017 02:16 PM, Jarkko Nikula wrote:
> On 06/22/2017 01:49 PM, Mika Westerberg wrote:
>> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
>>> On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org>
>>> wrote:
>> Thanks Ulf for taking care of this!
>>
> Indeed!
>
>> I tested this series on Dell XPS 9350 which has touch panel connected to
>> I2C and suspend/resume still works fine and I can see the controller
>> going to D3 when the touch panel is idle.
>>
>> I can perform more comprehensive testing next week.
>>
> Unfortunately I'm seeing interrupt storm during suspend/resume on
> platform using PM domain from drivers/acpi/acpi_lpss.c straight after
> this patch. Maybe some timing related as I see it only if I have debug
> messages on (i2c_designware_core.dyndbg=+p). But it occurs only after
> this patch.
>
Sorry the noise, this was bogus. That platform is doing this interrupt 
storm randomly and it occurs also without the patch.
Rafael J. Wysocki June 27, 2017, 3:25 p.m. | #12
On Mon, Jun 26, 2017 at 11:11 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Mon, Jun 26, 2017 at 9:39 PM, Grygorii Strashko
> <grygorii.strashko@ti.com> wrote:
>> On 06/26/2017 11:49 AM, Ulf Hansson wrote:
>
> [cut]
>
>>>>> This makes it worse in that sense, that then you don't even need an
>>>>> i2c slave to trigger the problem.
>>>>
>>>> In that case your changelog is a bit misleading.
>>>
>>> Yeah, it is! I didn't realize that until I fully investigated the revert option.
>>>
>>>>
>>>> It looks like the commit in question attempted to fix exactly this
>>>> issue, but it failed, so it should be replaced with something else
>>>> which is what your patch is effectively doing.
>>>>
>>>> IMO you should describe the original problem, explain why that commit
>>>> is not sufficient to fix it and then describe the final fix.
>>>>
>>>> Anyway, after reading the changelog it should be clear that things
>>>> were broken before the commit in question.
>>>>
>>>> And BTW I'm not really sure how the rest of the series is related to this?
>>>
>>> Going back in history, I realize the system sleep support in this
>>> driver has been broken even before the commit $subject patch intends
>>> to fix.
>>> However it has been working fine for the ACPI case, because of how the
>>> ACPI PM domain manages its devices during system sleep.
>>>
>>> The commit in question, adds an improvement to the driver, because it
>>> enables the direct_complete path. For ACPI, that was already working,
>>> but not for the other cases. So to be able to support the similar
>>> improvement as the direct_complete path offers, as that isn't working
>>> for this driver, I tried out using the runtime PM centric path
>>> instead. That is what the rest of the changes in this series takes
>>> care of.
>>>
>>> Now, as the system sleep support is broken I wanted to make a simple
>>> fix for that first, via $subject patch. I guess what makes this a bit
>>> confusing is that I shouldn't point to a certain commit, but rather
>>> just add a stable tag and update the changelog accordingly.
>
> I agree, modulo the below.
>
>>
>> Wouldn't it fix suspend for this driver if you will just replace
>> dw_i2c_plat_suspend() with pm_runtime_force_suspend() in SET_SYSTEM_SLEEP_PM_OPS() as
>> you've done in patch 9?
>>
>> And, I think, direct_complete path should still work after this also.
>
> That's a good point and I was about to mention that.
>
> In any case, even if the pm_runtime_resume() added by the $subject
> patch is necessary to start with, it could be added to the ->suspend

I meant ->resume, sorry.

> callback of the driver instead of the ->complete one, in which case
> the ACPI path would not be affected by this change.

Thanks,
Rafael
Ulf Hansson June 28, 2017, 2:01 p.m. | #13
On 27 June 2017 at 09:55, Jarkko Nikula <jarkko.nikula@linux.intel.com> wrote:
> On 06/22/2017 02:16 PM, Jarkko Nikula wrote:
>>
>> On 06/22/2017 01:49 PM, Mika Westerberg wrote:
>>>
>>> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
>>>>
>>>> On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org>
>>>> wrote:
>>>
>>> Thanks Ulf for taking care of this!
>>>
>> Indeed!
>>
>>> I tested this series on Dell XPS 9350 which has touch panel connected to
>>> I2C and suspend/resume still works fine and I can see the controller
>>> going to D3 when the touch panel is idle.
>>>
>>> I can perform more comprehensive testing next week.
>>>
>> Unfortunately I'm seeing interrupt storm during suspend/resume on
>> platform using PM domain from drivers/acpi/acpi_lpss.c straight after
>> this patch. Maybe some timing related as I see it only if I have debug
>> messages on (i2c_designware_core.dyndbg=+p). But it occurs only after
>> this patch.
>>
> Sorry the noise, this was bogus. That platform is doing this interrupt storm
> randomly and it occurs also without the patch.


Okay, then it seems like we should go with $subject patch, although
allow me to update the changelog and post a new version.

Kind regards
Uffe
Ulf Hansson June 28, 2017, 2:31 p.m. | #14
On 26 June 2017 at 21:39, Grygorii Strashko <grygorii.strashko@ti.com> wrote:
>
>
> On 06/26/2017 11:49 AM, Ulf Hansson wrote:
>> On 23 June 2017 at 00:01, Rafael J. Wysocki <rafael@kernel.org> wrote:
>>> On Thu, Jun 22, 2017 at 11:37 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>> On 22 June 2017 at 16:41, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>>> On Thursday, June 22, 2017 01:49:33 PM Mika Westerberg wrote:
>>>>>> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
>>>>>>> On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>>>>>> The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
>>>>>>>> during system suspend"), may suggest to the PM core to try out the so
>>>>>>>> called direct_complete path for system sleep. In this path, the PM core
>>>>>>>> treats a runtime suspended device as it's already in a proper low power
>>>>>>>> state for system sleep, which makes it skip calling the system sleep
>>>>>>>> callbacks for the device, except for the ->prepare() and the ->complete()
>>>>>>>> callback.
>>>>>>>>
>>>>>>>> Moreover, under certain circumstances the PM core may unset the
>>>>>>>> direct_complete flag for a parent device, in case its child device are
>>>>>>>> being system suspended before. In other words, the PM core doesn't skip
>>>>>>>> calling the system sleep callbacks, no matter if the device is runtime
>>>>>>>> suspended or not.
>>>>>>>>
>>>>>>>> In cases of an i2c slave device, the above situation is triggered.
>>>>>>>> Unfortunate, this also breaks the assumption that the i2c device is always
>>>>>>>> runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
>>>>>>>> invoked, which then leads to a regression.
>>>>>>>>
>>>>>>>> More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
>>>>>>>> clk_core_unprepare(), for an already disabled/unprepared clock, leading to
>>>>>>>> complaints about clocks calls being wrongly balanced.
>>>>>>>>
>>>>>>>> In cases when the i2c device is attached to the ACPI PM domain, the problem
>>>>>>>> doesn't occur. That's because ACPI's ->suspend() callback, assigned to
>>>>>>>> acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
>>>>>>>
>>>>>>> Which really is expected to happen, so direct_complete should only be
>>>>>>> used along with the ACPI PM domain in this case.
>>>>>>>
>>>>>>> Moreover, in the ACPI PM domain case acpi_subsys_prepare() is supposed
>>>>>>> to do the right thing without dw_i2c_plat_prepare() and the return
>>>>>>> value of the latter will be ignored anyway, so dw_i2c_plat_prepare()
>>>>>>> will only have effect without ACPI PM domain AFAICS.
>>>>>>>
>>>>>>> It looks like commit 8503ff166504 is entirely misguided.
>>>>>>
>>>>>> Indeed it is. At the time I suggested that change I did not really
>>>>>> understand how the direct complete is supposed to be used :-/
>>>>>
>>>>> So can we go for a full revert, please, and then fix up things properly?
>>>>
>>>> Unfortunate I think a revert is going to make it worse, for the non ACPI case.
>>>>
>>>> Because in that case, unless I am reading the code wrong, I think when
>>>> the device is runtime suspended during system sleep, then the system
>>>> suspend callback will also be called (because the direct_complete
>>>> isn't run), again causing clock unbalance issues.
>>>>
>>>> This makes it worse in that sense, that then you don't even need an
>>>> i2c slave to trigger the problem.
>>>
>>> In that case your changelog is a bit misleading.
>>
>> Yeah, it is! I didn't realize that until I fully investigated the revert option.
>>
>>>
>>> It looks like the commit in question attempted to fix exactly this
>>> issue, but it failed, so it should be replaced with something else
>>> which is what your patch is effectively doing.
>>>
>>> IMO you should describe the original problem, explain why that commit
>>> is not sufficient to fix it and then describe the final fix.
>>>
>>> Anyway, after reading the changelog it should be clear that things
>>> were broken before the commit in question.
>>>
>>> And BTW I'm not really sure how the rest of the series is related to this?
>>
>> Going back in history, I realize the system sleep support in this
>> driver has been broken even before the commit $subject patch intends
>> to fix.
>> However it has been working fine for the ACPI case, because of how the
>> ACPI PM domain manages its devices during system sleep.
>>
>> The commit in question, adds an improvement to the driver, because it
>> enables the direct_complete path. For ACPI, that was already working,
>> but not for the other cases. So to be able to support the similar
>> improvement as the direct_complete path offers, as that isn't working
>> for this driver, I tried out using the runtime PM centric path
>> instead. That is what the rest of the changes in this series takes
>> care of.
>>
>> Now, as the system sleep support is broken I wanted to make a simple
>> fix for that first, via $subject patch. I guess what makes this a bit
>> confusing is that I shouldn't point to a certain commit, but rather
>> just add a stable tag and update the changelog accordingly.
>>
>
> Wouldn't it fix suspend for this driver if you will just replace
> dw_i2c_plat_suspend() with pm_runtime_force_suspend() in SET_SYSTEM_SLEEP_PM_OPS() as
> you've done in patch 9?

For the non-ACPI case - yes.

For the ACPI case - no. Here we need to adapt the ACPI PM domain
first, as it currently doesn't support the runtime PM centric path for
system sleep.

>
> And, I think, direct_complete path should still work after this also.

I don't understand why we want or need that?

To me it would only make the code in the ACPI PM domain more
complicated, comparing to the changes I have posted in rest of this
series.

Kind regards
Uffe
Rafael J. Wysocki June 28, 2017, 2:51 p.m. | #15
On Wednesday, June 28, 2017 04:01:39 PM Ulf Hansson wrote:
> On 27 June 2017 at 09:55, Jarkko Nikula <jarkko.nikula@linux.intel.com> wrote:
> > On 06/22/2017 02:16 PM, Jarkko Nikula wrote:
> >>
> >> On 06/22/2017 01:49 PM, Mika Westerberg wrote:
> >>>
> >>> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
> >>>>
> >>>> On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org>
> >>>> wrote:
> >>>
> >>> Thanks Ulf for taking care of this!
> >>>
> >> Indeed!
> >>
> >>> I tested this series on Dell XPS 9350 which has touch panel connected to
> >>> I2C and suspend/resume still works fine and I can see the controller
> >>> going to D3 when the touch panel is idle.
> >>>
> >>> I can perform more comprehensive testing next week.
> >>>
> >> Unfortunately I'm seeing interrupt storm during suspend/resume on
> >> platform using PM domain from drivers/acpi/acpi_lpss.c straight after
> >> this patch. Maybe some timing related as I see it only if I have debug
> >> messages on (i2c_designware_core.dyndbg=+p). But it occurs only after
> >> this patch.
> >>
> > Sorry the noise, this was bogus. That platform is doing this interrupt storm
> > randomly and it occurs also without the patch.
> 
> 
> Okay, then it seems like we should go with $subject patch, although
> allow me to update the changelog and post a new version.

But as I said, it would be better to do the pm_runtime_resume() in ->resume
(unless that breaks something), for two reasons.

The first reason is that ->complete is synchronous and ->resume can be done
in parallel with other devices which potentially saves time.  The second reason
is that it wouldn't interfere with direct_complete on systems where that
actually works.

Thanks,
Rafael
Ulf Hansson June 28, 2017, 3:06 p.m. | #16
On 28 June 2017 at 16:51, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Wednesday, June 28, 2017 04:01:39 PM Ulf Hansson wrote:
>> On 27 June 2017 at 09:55, Jarkko Nikula <jarkko.nikula@linux.intel.com> wrote:
>> > On 06/22/2017 02:16 PM, Jarkko Nikula wrote:
>> >>
>> >> On 06/22/2017 01:49 PM, Mika Westerberg wrote:
>> >>>
>> >>> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
>> >>>>
>> >>>> On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org>
>> >>>> wrote:
>> >>>
>> >>> Thanks Ulf for taking care of this!
>> >>>
>> >> Indeed!
>> >>
>> >>> I tested this series on Dell XPS 9350 which has touch panel connected to
>> >>> I2C and suspend/resume still works fine and I can see the controller
>> >>> going to D3 when the touch panel is idle.
>> >>>
>> >>> I can perform more comprehensive testing next week.
>> >>>
>> >> Unfortunately I'm seeing interrupt storm during suspend/resume on
>> >> platform using PM domain from drivers/acpi/acpi_lpss.c straight after
>> >> this patch. Maybe some timing related as I see it only if I have debug
>> >> messages on (i2c_designware_core.dyndbg=+p). But it occurs only after
>> >> this patch.
>> >>
>> > Sorry the noise, this was bogus. That platform is doing this interrupt storm
>> > randomly and it occurs also without the patch.
>>
>>
>> Okay, then it seems like we should go with $subject patch, although
>> allow me to update the changelog and post a new version.
>
> But as I said, it would be better to do the pm_runtime_resume() in ->resume
> (unless that breaks something), for two reasons.
>
> The first reason is that ->complete is synchronous and ->resume can be done
> in parallel with other devices which potentially saves time.  The second reason
> is that it wouldn't interfere with direct_complete on systems where that
> actually works.

Yes, that's good points - and I don't see any obvious reasons to why
it shouldn't work.

Kind regards
Uffe
Grygorii Strashko June 28, 2017, 4:52 p.m. | #17
On 06/28/2017 09:31 AM, Ulf Hansson wrote:
> On 26 June 2017 at 21:39, Grygorii Strashko <grygorii.strashko@ti.com> wrote:
>>
>>
>> On 06/26/2017 11:49 AM, Ulf Hansson wrote:
>>> On 23 June 2017 at 00:01, Rafael J. Wysocki <rafael@kernel.org> wrote:
>>>> On Thu, Jun 22, 2017 at 11:37 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>>> On 22 June 2017 at 16:41, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>>>> On Thursday, June 22, 2017 01:49:33 PM Mika Westerberg wrote:
>>>>>>> On Thu, Jun 22, 2017 at 01:31:51AM +0200, Rafael J. Wysocki wrote:
>>>>>>>> On Wed, Jun 21, 2017 at 9:21 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>>>>>>> The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
>>>>>>>>> during system suspend"), may suggest to the PM core to try out the so
>>>>>>>>> called direct_complete path for system sleep. In this path, the PM core
>>>>>>>>> treats a runtime suspended device as it's already in a proper low power
>>>>>>>>> state for system sleep, which makes it skip calling the system sleep
>>>>>>>>> callbacks for the device, except for the ->prepare() and the ->complete()
>>>>>>>>> callback.
>>>>>>>>>
>>>>>>>>> Moreover, under certain circumstances the PM core may unset the
>>>>>>>>> direct_complete flag for a parent device, in case its child device are
>>>>>>>>> being system suspended before. In other words, the PM core doesn't skip
>>>>>>>>> calling the system sleep callbacks, no matter if the device is runtime
>>>>>>>>> suspended or not.
>>>>>>>>>
>>>>>>>>> In cases of an i2c slave device, the above situation is triggered.
>>>>>>>>> Unfortunate, this also breaks the assumption that the i2c device is always
>>>>>>>>> runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
>>>>>>>>> invoked, which then leads to a regression.
>>>>>>>>>
>>>>>>>>> More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
>>>>>>>>> clk_core_unprepare(), for an already disabled/unprepared clock, leading to
>>>>>>>>> complaints about clocks calls being wrongly balanced.
>>>>>>>>>
>>>>>>>>> In cases when the i2c device is attached to the ACPI PM domain, the problem
>>>>>>>>> doesn't occur. That's because ACPI's ->suspend() callback, assigned to
>>>>>>>>> acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
>>>>>>>>
>>>>>>>> Which really is expected to happen, so direct_complete should only be
>>>>>>>> used along with the ACPI PM domain in this case.
>>>>>>>>
>>>>>>>> Moreover, in the ACPI PM domain case acpi_subsys_prepare() is supposed
>>>>>>>> to do the right thing without dw_i2c_plat_prepare() and the return
>>>>>>>> value of the latter will be ignored anyway, so dw_i2c_plat_prepare()
>>>>>>>> will only have effect without ACPI PM domain AFAICS.
>>>>>>>>
>>>>>>>> It looks like commit 8503ff166504 is entirely misguided.
>>>>>>>
>>>>>>> Indeed it is. At the time I suggested that change I did not really
>>>>>>> understand how the direct complete is supposed to be used :-/
>>>>>>
>>>>>> So can we go for a full revert, please, and then fix up things properly?
>>>>>
>>>>> Unfortunate I think a revert is going to make it worse, for the non ACPI case.
>>>>>
>>>>> Because in that case, unless I am reading the code wrong, I think when
>>>>> the device is runtime suspended during system sleep, then the system
>>>>> suspend callback will also be called (because the direct_complete
>>>>> isn't run), again causing clock unbalance issues.
>>>>>
>>>>> This makes it worse in that sense, that then you don't even need an
>>>>> i2c slave to trigger the problem.
>>>>
>>>> In that case your changelog is a bit misleading.
>>>
>>> Yeah, it is! I didn't realize that until I fully investigated the revert option.
>>>
>>>>
>>>> It looks like the commit in question attempted to fix exactly this
>>>> issue, but it failed, so it should be replaced with something else
>>>> which is what your patch is effectively doing.
>>>>
>>>> IMO you should describe the original problem, explain why that commit
>>>> is not sufficient to fix it and then describe the final fix.
>>>>
>>>> Anyway, after reading the changelog it should be clear that things
>>>> were broken before the commit in question.
>>>>
>>>> And BTW I'm not really sure how the rest of the series is related to this?
>>>
>>> Going back in history, I realize the system sleep support in this
>>> driver has been broken even before the commit $subject patch intends
>>> to fix.
>>> However it has been working fine for the ACPI case, because of how the
>>> ACPI PM domain manages its devices during system sleep.
>>>
>>> The commit in question, adds an improvement to the driver, because it
>>> enables the direct_complete path. For ACPI, that was already working,
>>> but not for the other cases. So to be able to support the similar
>>> improvement as the direct_complete path offers, as that isn't working
>>> for this driver, I tried out using the runtime PM centric path
>>> instead. That is what the rest of the changes in this series takes
>>> care of.
>>>
>>> Now, as the system sleep support is broken I wanted to make a simple
>>> fix for that first, via $subject patch. I guess what makes this a bit
>>> confusing is that I shouldn't point to a certain commit, but rather
>>> just add a stable tag and update the changelog accordingly.
>>>
>>
>> Wouldn't it fix suspend for this driver if you will just replace
>> dw_i2c_plat_suspend() with pm_runtime_force_suspend() in SET_SYSTEM_SLEEP_PM_OPS() as
>> you've done in patch 9?
> 
> For the non-ACPI case - yes.
> 
> For the ACPI case - no. Here we need to adapt the ACPI PM domain
> first, as it currently doesn't support the runtime PM centric path for
> system sleep.

Yeah. 

> 
>>
>> And, I think, direct_complete path should still work after this also.
> 
> I don't understand why we want or need that?
> 
> To me it would only make the code in the ACPI PM domain more
> complicated, comparing to the changes I have posted in rest of this
> series.
> 

I'm not sure about benefits with this particular driver - everything
depend on power on/off latencies, if they are big direct_complete make sense.

The main point of my comment was based on fact that dw_i2c driver uses
the same function for PM runtime and System suspend callbacks, but this is
definitely wrong (ACPI or not), because PM runtime and System suspend
are not synchronized.
Tao Wang Sept. 8, 2017, 3:23 a.m. | #18
This patch can fix the issue we found on hikey960 that i2c doesn't skip system suspend when it is runtime suspended.

在 2017/6/22 3:21, Ulf Hansson 写道:
> The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
> during system suspend"), may suggest to the PM core to try out the so
> called direct_complete path for system sleep. In this path, the PM core
> treats a runtime suspended device as it's already in a proper low power
> state for system sleep, which makes it skip calling the system sleep
> callbacks for the device, except for the ->prepare() and the ->complete()
> callback.
> 
> Moreover, under certain circumstances the PM core may unset the
> direct_complete flag for a parent device, in case its child device are
> being system suspended before. In other words, the PM core doesn't skip
> calling the system sleep callbacks, no matter if the device is runtime
> suspended or not.
> 
> In cases of an i2c slave device, the above situation is triggered.
> Unfortunate, this also breaks the assumption that the i2c device is always
> runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
> invoked, which then leads to a regression.
> 
> More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
> clk_core_unprepare(), for an already disabled/unprepared clock, leading to
> complaints about clocks calls being wrongly balanced.
> 
> In cases when the i2c device is attached to the ACPI PM domain, the problem
> doesn't occur. That's because ACPI's ->suspend() callback, assigned to
> acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
> 
> To make a simple fix for this regression, let's runtime resume the i2c
> device in the ->prepare() callback, assigned to dw_i2c_plat_prepare(). This
> prevents the direct_complete path from being executed by the PM core and
> guarantees the dw_i2c_plat_suspend() is called with the i2c device always
> being runtime resumed.
> 
> Of course this change is suboptimal, because to always force a runtime
> resume of the i2c device in ->prepare() is a waste, especially in those
> cases when it could have remained runtime suspended during the entire
> system sleep sequence. However, to accomplish that behaviour a bigger
> change is needed, so defer that to future changes not applicable as fixes
> or for stable.
> 
> Fixes: 8503ff166504 ("i2c: designware: Avoid unnecessary resuming...")
> Cc: stable@vger.linux.org
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>   drivers/i2c/busses/i2c-designware-platdrv.c | 11 ++---------
>   1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c
> index d1263b8..2b7fa75 100644
> --- a/drivers/i2c/busses/i2c-designware-platdrv.c
> +++ b/drivers/i2c/busses/i2c-designware-platdrv.c
> @@ -375,17 +375,11 @@ MODULE_DEVICE_TABLE(of, dw_i2c_of_match);
>   #ifdef CONFIG_PM_SLEEP
>   static int dw_i2c_plat_prepare(struct device *dev)
>   {
> -	return pm_runtime_suspended(dev);
> -}
> -
> -static void dw_i2c_plat_complete(struct device *dev)
> -{
> -	if (dev->power.direct_complete)
> -		pm_request_resume(dev);
> +	pm_runtime_resume(dev);
> +	return 0;
>   }
>   #else
>   #define dw_i2c_plat_prepare	NULL
> -#define dw_i2c_plat_complete	NULL
>   #endif
>   
>   #ifdef CONFIG_PM
> @@ -413,7 +407,6 @@ static int dw_i2c_plat_resume(struct device *dev)
>   
>   static const struct dev_pm_ops dw_i2c_dev_pm_ops = {
>   	.prepare = dw_i2c_plat_prepare,
> -	.complete = dw_i2c_plat_complete,
>   	SET_SYSTEM_SLEEP_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume)
>   	SET_RUNTIME_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume, NULL)
>   };
>
Ulf Hansson Sept. 8, 2017, 8:29 a.m. | #19
On 8 September 2017 at 05:23, Wangtao (Kevin, Kirin)
<kevin.wangtao@hisilicon.com> wrote:
>
> This patch can fix the issue we found on hikey960 that i2c doesn't skip
> system suspend when it is runtime suspended.

We decided to go with a slightly different version. You may try out
the below and see if that also fixes your problem.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a23318feeff662c8d25d21623daebdd2e55ec221

Kind regards
Uffe

>
>
> 在 2017/6/22 3:21, Ulf Hansson 写道:
>>
>> The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
>> during system suspend"), may suggest to the PM core to try out the so
>> called direct_complete path for system sleep. In this path, the PM core
>> treats a runtime suspended device as it's already in a proper low power
>> state for system sleep, which makes it skip calling the system sleep
>> callbacks for the device, except for the ->prepare() and the ->complete()
>> callback.
>>
>> Moreover, under certain circumstances the PM core may unset the
>> direct_complete flag for a parent device, in case its child device are
>> being system suspended before. In other words, the PM core doesn't skip
>> calling the system sleep callbacks, no matter if the device is runtime
>> suspended or not.
>>
>> In cases of an i2c slave device, the above situation is triggered.
>> Unfortunate, this also breaks the assumption that the i2c device is always
>> runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
>> invoked, which then leads to a regression.
>>
>> More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
>> clk_core_unprepare(), for an already disabled/unprepared clock, leading to
>> complaints about clocks calls being wrongly balanced.
>>
>> In cases when the i2c device is attached to the ACPI PM domain, the
>> problem
>> doesn't occur. That's because ACPI's ->suspend() callback, assigned to
>> acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
>>
>> To make a simple fix for this regression, let's runtime resume the i2c
>> device in the ->prepare() callback, assigned to dw_i2c_plat_prepare().
>> This
>> prevents the direct_complete path from being executed by the PM core and
>> guarantees the dw_i2c_plat_suspend() is called with the i2c device always
>> being runtime resumed.
>>
>> Of course this change is suboptimal, because to always force a runtime
>> resume of the i2c device in ->prepare() is a waste, especially in those
>> cases when it could have remained runtime suspended during the entire
>> system sleep sequence. However, to accomplish that behaviour a bigger
>> change is needed, so defer that to future changes not applicable as fixes
>> or for stable.
>>
>> Fixes: 8503ff166504 ("i2c: designware: Avoid unnecessary resuming...")
>> Cc: stable@vger.linux.org
>> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
>> ---
>>   drivers/i2c/busses/i2c-designware-platdrv.c | 11 ++---------
>>   1 file changed, 2 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c
>> b/drivers/i2c/busses/i2c-designware-platdrv.c
>> index d1263b8..2b7fa75 100644
>> --- a/drivers/i2c/busses/i2c-designware-platdrv.c
>> +++ b/drivers/i2c/busses/i2c-designware-platdrv.c
>> @@ -375,17 +375,11 @@ MODULE_DEVICE_TABLE(of, dw_i2c_of_match);
>>   #ifdef CONFIG_PM_SLEEP
>>   static int dw_i2c_plat_prepare(struct device *dev)
>>   {
>> -       return pm_runtime_suspended(dev);
>> -}
>> -
>> -static void dw_i2c_plat_complete(struct device *dev)
>> -{
>> -       if (dev->power.direct_complete)
>> -               pm_request_resume(dev);
>> +       pm_runtime_resume(dev);
>> +       return 0;
>>   }
>>   #else
>>   #define dw_i2c_plat_prepare   NULL
>> -#define dw_i2c_plat_complete   NULL
>>   #endif
>>     #ifdef CONFIG_PM
>> @@ -413,7 +407,6 @@ static int dw_i2c_plat_resume(struct device *dev)
>>     static const struct dev_pm_ops dw_i2c_dev_pm_ops = {
>>         .prepare = dw_i2c_plat_prepare,
>> -       .complete = dw_i2c_plat_complete,
>>         SET_SYSTEM_SLEEP_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume)
>>         SET_RUNTIME_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume, NULL)
>>   };
>>
>
Tao Wang Sept. 12, 2017, 9:44 a.m. | #20
OK, this patch can also fix the problem.
On 2017/9/8 16:29, Ulf Hansson wrote:
> On 8 September 2017 at 05:23, Wangtao (Kevin, Kirin)
> <kevin.wangtao@hisilicon.com> wrote:
>>
>> This patch can fix the issue we found on hikey960 that i2c doesn't skip
>> system suspend when it is runtime suspended.
> 
> We decided to go with a slightly different version. You may try out
> the below and see if that also fixes your problem.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a23318feeff662c8d25d21623daebdd2e55ec221
> 
> Kind regards
> Uffe
> 
>>
>>
>> 在 2017/6/22 3:21, Ulf Hansson 写道:
>>>
>>> The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
>>> during system suspend"), may suggest to the PM core to try out the so
>>> called direct_complete path for system sleep. In this path, the PM core
>>> treats a runtime suspended device as it's already in a proper low power
>>> state for system sleep, which makes it skip calling the system sleep
>>> callbacks for the device, except for the ->prepare() and the ->complete()
>>> callback.
>>>
>>> Moreover, under certain circumstances the PM core may unset the
>>> direct_complete flag for a parent device, in case its child device are
>>> being system suspended before. In other words, the PM core doesn't skip
>>> calling the system sleep callbacks, no matter if the device is runtime
>>> suspended or not.
>>>
>>> In cases of an i2c slave device, the above situation is triggered.
>>> Unfortunate, this also breaks the assumption that the i2c device is always
>>> runtime resumed, whenever the dw_i2c_plat_suspend() callback is being
>>> invoked, which then leads to a regression.
>>>
>>> More precisely, dw_i2c_plat_suspend() then calls clk_core_disable() and
>>> clk_core_unprepare(), for an already disabled/unprepared clock, leading to
>>> complaints about clocks calls being wrongly balanced.
>>>
>>> In cases when the i2c device is attached to the ACPI PM domain, the
>>> problem
>>> doesn't occur. That's because ACPI's ->suspend() callback, assigned to
>>> acpi_subsys_suspend(), calls pm_runtime_resume() for the i2c device.
>>>
>>> To make a simple fix for this regression, let's runtime resume the i2c
>>> device in the ->prepare() callback, assigned to dw_i2c_plat_prepare().
>>> This
>>> prevents the direct_complete path from being executed by the PM core and
>>> guarantees the dw_i2c_plat_suspend() is called with the i2c device always
>>> being runtime resumed.
>>>
>>> Of course this change is suboptimal, because to always force a runtime
>>> resume of the i2c device in ->prepare() is a waste, especially in those
>>> cases when it could have remained runtime suspended during the entire
>>> system sleep sequence. However, to accomplish that behaviour a bigger
>>> change is needed, so defer that to future changes not applicable as fixes
>>> or for stable.
>>>
>>> Fixes: 8503ff166504 ("i2c: designware: Avoid unnecessary resuming...")
>>> Cc: stable@vger.linux.org
>>> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
>>> ---
>>>    drivers/i2c/busses/i2c-designware-platdrv.c | 11 ++---------
>>>    1 file changed, 2 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c
>>> b/drivers/i2c/busses/i2c-designware-platdrv.c
>>> index d1263b8..2b7fa75 100644
>>> --- a/drivers/i2c/busses/i2c-designware-platdrv.c
>>> +++ b/drivers/i2c/busses/i2c-designware-platdrv.c
>>> @@ -375,17 +375,11 @@ MODULE_DEVICE_TABLE(of, dw_i2c_of_match);
>>>    #ifdef CONFIG_PM_SLEEP
>>>    static int dw_i2c_plat_prepare(struct device *dev)
>>>    {
>>> -       return pm_runtime_suspended(dev);
>>> -}
>>> -
>>> -static void dw_i2c_plat_complete(struct device *dev)
>>> -{
>>> -       if (dev->power.direct_complete)
>>> -               pm_request_resume(dev);
>>> +       pm_runtime_resume(dev);
>>> +       return 0;
>>>    }
>>>    #else
>>>    #define dw_i2c_plat_prepare   NULL
>>> -#define dw_i2c_plat_complete   NULL
>>>    #endif
>>>      #ifdef CONFIG_PM
>>> @@ -413,7 +407,6 @@ static int dw_i2c_plat_resume(struct device *dev)
>>>      static const struct dev_pm_ops dw_i2c_dev_pm_ops = {
>>>          .prepare = dw_i2c_plat_prepare,
>>> -       .complete = dw_i2c_plat_complete,
>>>          SET_SYSTEM_SLEEP_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume)
>>>          SET_RUNTIME_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume, NULL)
>>>    };
>>>
>>
> 
> .
>

Patch

diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c
index d1263b8..2b7fa75 100644
--- a/drivers/i2c/busses/i2c-designware-platdrv.c
+++ b/drivers/i2c/busses/i2c-designware-platdrv.c
@@ -375,17 +375,11 @@  MODULE_DEVICE_TABLE(of, dw_i2c_of_match);
 #ifdef CONFIG_PM_SLEEP
 static int dw_i2c_plat_prepare(struct device *dev)
 {
-	return pm_runtime_suspended(dev);
-}
-
-static void dw_i2c_plat_complete(struct device *dev)
-{
-	if (dev->power.direct_complete)
-		pm_request_resume(dev);
+	pm_runtime_resume(dev);
+	return 0;
 }
 #else
 #define dw_i2c_plat_prepare	NULL
-#define dw_i2c_plat_complete	NULL
 #endif
 
 #ifdef CONFIG_PM
@@ -413,7 +407,6 @@  static int dw_i2c_plat_resume(struct device *dev)
 
 static const struct dev_pm_ops dw_i2c_dev_pm_ops = {
 	.prepare = dw_i2c_plat_prepare,
-	.complete = dw_i2c_plat_complete,
 	SET_SYSTEM_SLEEP_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume)
 	SET_RUNTIME_PM_OPS(dw_i2c_plat_suspend, dw_i2c_plat_resume, NULL)
 };