[v2,4/5] i2c: designware-baytrail: Force the CPU to C1 state while holding the punit semaphore

Message ID 20161210141908.16470-4-hdegoede@redhat.com
State Superseded

Commit Message

Hans de Goede Dec. 10, 2016, 2:19 p.m. UTC
On my cherrytrail tablet with axp288 pmic, just doing a bunch of repeated
reads from the pmic, e.g. "i2cdump -y 14 0x34", would lock up the tablet
within 1 - 3 runs, guaranteed.

This seems to be caused by the cpuidle / intel_idle driver trying to
change the C-state while we hold the punit bus semaphore, at which point
everything just hangs.

Avoid this by forcing the CPU to C1 before acquiring the punit bus
semaphore.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109051
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 drivers/i2c/busses/i2c-designware-baytrail.c | 12 ++++++++++++
 drivers/i2c/busses/i2c-designware-core.h     |  3 +++
 drivers/i2c/busses/i2c-designware-platdrv.c  |  3 +++
 3 files changed, 18 insertions(+)

Comments

Andy Shevchenko Dec. 10, 2016, 2:53 p.m. UTC | #1
+Cc: Len
Len, I think you would be interested in this.

Hans, thanks for the change! Most probably we will wait for Len's ACK
on this one.

On Sat, 2016-12-10 at 15:19 +0100, Hans de Goede wrote:
> On my cherrytrail tablet with axp288 pmic, just doing a bunch of
> repeated
> reads from the pmic, e.g. "i2cdump -y 14 0x34" would lookup the tablet
> in
> 1 - 3 runs guaranteed.
> 
> This seems to be causes by the cpuidle / intel_idle driver trying to
> change the C-state while we hold the punit bus semaphore, at which
> point
> everything just hangs.
> 
> Avoid this by forcing the CPU to C1 before acquiring the punit bus
> semaphore.

Isn't it C0? C1, as far as I remember, is the halted state.

> @@ -33,6 +34,13 @@ static int get_sem(struct dw_i2c_dev *dev, u32
> *sem)
>  	u32 data;
>  	int ret;
>  
> +	/*
> +	 * Force CPU to C1 state, otherwise if the cpuidle /
> intel_idle
> +	 * driver tries to change the C state while we're holding the
> +	 * semaphore, the SoC hangs.

C0?

> +	 */
> +	pm_qos_update_request(&dev->pm_qos, 0);

C1 is when you set 1 here, right?

> platform_device *pdev)
>  	if (!dev->pm_runtime_disabled)
>  		pm_runtime_disable(&pdev->dev);

> +	if (dev->acquire_lock)
> +		pm_qos_remove_request(&dev->pm_qos);
> +

Perhaps you need to do this in -core.c. Otherwise you miss the PCI case.
(Even with a PCI-enumerated host and ACPI-enabled firmware you may get
a _SEM object present.)

>  	return 0;
>  }
>
Hans de Goede Dec. 10, 2016, 7:33 p.m. UTC | #2
Hi,

On 10-12-16 15:53, Andy Shevchenko wrote:
> +Cc: Len
> Len, I think you would be interested by this.
>
> Hans, thanks for the change!

You're welcome. I ended up comparing the code in
i2c-dw_i2c-Ported-punit-locking-patch-from-MCG-kerne.patch from:

https://github.com/01org/ProductionKernelQuilts/tree/master/uefi/cht-m1stable/patches

against the mainline code while I was trying to fix the maddening
problem of the entire SoC hanging more or less as soon
as I tried to use the pmic i2c bus, and there I found
some fiddling with pm_qos which led to this patch.

> Most probably we will anticipate Len's ACK
> on this one.
>
> On Sat, 2016-12-10 at 15:19 +0100, Hans de Goede wrote:
>> On my cherrytrail tablet with axp288 pmic, just doing a bunch of
>> repeated
>> reads from the pmic, e.g. "i2cdump -y 14 0x34" would lookup the tablet
>> in
>> 1 - 3 runs guaranteed.
>>
>> This seems to be causes by the cpuidle / intel_idle driver trying to
>> change the C-state while we hold the punit bus semaphore, at which
>> point
>> everything just hangs.
>>
>> Avoid this by forcing the CPU to C1 before acquiring the punit bus
>> semaphore.
>
> Isn't it C0? C1 as far as I remember is halted state.

You're right, I will fix it.

>> @@ -33,6 +34,13 @@ static int get_sem(struct dw_i2c_dev *dev, u32
>> *sem)
>>  	u32 data;
>>  	int ret;
>>
>> +	/*
>> +	 * Force CPU to C1 state, otherwise if the cpuidle /
>> intel_idle
>> +	 * driver tries to change the C state while we're holding the
>> +	 * semaphore, the SoC hangs.
>
> C0?
>
>> +	 */
>> +	pm_qos_update_request(&dev->pm_qos, 0);
>
> C1 is when you set 1 here, right?

I believe so, yes.

>
>> platform_device *pdev)
>>  	if (!dev->pm_runtime_disabled)
>>  		pm_runtime_disable(&pdev->dev);
>
>> +	if (dev->acquire_lock)
>> +		pm_qos_remove_request(&dev->pm_qos);
>> +
>
> Perhaps you need to do this in -core.c. Otherwise you missed PCI case.
> (Even with PCI enumerated host with ACPI-enabled firmware you may get
> _SEM object present)

Currently only i2c-designware-platdrv.c calls i2c_dw_eval_lock_support(),
which does the pm_qos_add_request, so I put it here to keep things
balanced.
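
The balance argument can be pictured with a small user-space mock. This
is only a sketch: the mock_* names and the struct layout are hypothetical
stand-ins, not the kernel's pm_qos implementation. Only the call order
follows the patch (add at probe when the bus is shared with the punit,
update to 0 in get_sem(), back to the default in reset_semaphore(),
remove in the remove path).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PM_QOS_DEFAULT_VALUE ("no constraint"). */
#define MOCK_QOS_DEFAULT (-1)

/* Hypothetical stand-in for struct pm_qos_request. */
struct mock_qos_request {
	bool active;	/* set by add, cleared by remove */
	int value;	/* current latency constraint */
};

void mock_qos_add(struct mock_qos_request *req, int value)
{
	assert(!req->active);	/* adding twice would be unbalanced */
	req->active = true;
	req->value = value;
}

void mock_qos_update(struct mock_qos_request *req, int value)
{
	assert(req->active);	/* update is only valid after add */
	req->value = value;
}

void mock_qos_remove(struct mock_qos_request *req)
{
	assert(req->active);	/* remove without add would be unbalanced */
	req->active = false;
}

struct mock_dev {
	struct mock_qos_request pm_qos;
	bool has_lock_support;	/* mirrors "dev->acquire_lock != NULL" */
};

/* Probe path: only a bus shared with the punit registers a request. */
void mock_eval_lock_support(struct mock_dev *dev, bool shared_with_punit)
{
	dev->has_lock_support = shared_with_punit;
	if (shared_with_punit)
		mock_qos_add(&dev->pm_qos, MOCK_QOS_DEFAULT);
}

/* Tighten the constraint before taking the semaphore. */
void mock_get_sem(struct mock_dev *dev)
{
	mock_qos_update(&dev->pm_qos, 0);
}

/* Relax the constraint again once the semaphore is released. */
void mock_reset_semaphore(struct mock_dev *dev)
{
	mock_qos_update(&dev->pm_qos, MOCK_QOS_DEFAULT);
}

/* Remove path: undo the add only if probe actually did it. */
void mock_remove(struct mock_dev *dev)
{
	if (dev->has_lock_support)
		mock_qos_remove(&dev->pm_qos);
}
```

With this shape, a device that never went through the add step never
reaches the remove step, which is the invariant the dev->acquire_lock
check in dw_i2c_plat_remove() is meant to keep.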

Regards,

Hans
--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hans de Goede Dec. 10, 2016, 7:59 p.m. UTC | #3
Hi,

On 10-12-16 20:33, Hans de Goede wrote:
> Hi,
>
> On 10-12-16 15:53, Andy Shevchenko wrote:
>> +Cc: Len
>> Len, I think you would be interested by this.
>>
>> Hans, thanks for the change!
>
> You're welcome I ended up comparing the code in
> i2c-dw_i2c-Ported-punit-locking-patch-from-MCG-kerne.patch from:
>
> https://github.com/01org/ProductionKernelQuilts/tree/master/uefi/cht-m1stable/patches
>
> against the mainline code while I was trying to fix the maddening
> problem of the entire SoC hanging more or less as soon
> as I tried to use the pmic i2c bus and there I found
> some fiddling with pm_qos which let to this patch.
>
>> Most probably we will anticipate Len's ACK
>> on this one.
>>
>> On Sat, 2016-12-10 at 15:19 +0100, Hans de Goede wrote:
>>> On my cherrytrail tablet with axp288 pmic, just doing a bunch of
>>> repeated
>>> reads from the pmic, e.g. "i2cdump -y 14 0x34" would lookup the tablet
>>> in
>>> 1 - 3 runs guaranteed.
>>>
>>> This seems to be causes by the cpuidle / intel_idle driver trying to
>>> change the C-state while we hold the punit bus semaphore, at which
>>> point
>>> everything just hangs.
>>>
>>> Avoid this by forcing the CPU to C1 before acquiring the punit bus
>>> semaphore.
>>
>> Isn't it C0? C1 as far as I remember is halted state.
>
> You're right, I will fix it.

Correction: upon closer reading of the docs, we cannot disallow the
CPU from entering C1, or force it into either C0 or C1. What we can
disallow is for it to enter C6/C7. That also makes sense wrt this
bug, since entering C6/C7 involves turning off the CPU-core
power-plane, which requires the punit to access the pmic.

So I've changed the text in both the commit msg and the comment
to: "Disallow the CPU to enter C6 or C7"
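
As a rough illustration of why a latency request of 0 keeps the CPU out
of C6/C7: an idle governor can be modelled as picking the deepest state
whose exit latency fits the current limit. The sketch below is a
simplified user-space model, not the cpuidle code. The state table, its
latencies and the function name are made up for illustration, and it
assumes the shallowest state can never be excluded.

```c
#include <assert.h>
#include <stddef.h>

struct idle_state {
	const char *name;
	int exit_latency_us;
};

/* Illustrative values only; these are NOT real Cherry Trail latencies. */
const struct idle_state example_states[] = {
	{ "C1", 1 },
	{ "C6", 80 },
	{ "C7", 1200 },
};

/*
 * Return the index of the deepest state whose exit latency does not
 * exceed latency_limit_us. States must be ordered shallow to deep.
 * A negative limit means "no constraint" (a stand-in for the default
 * request value). Index 0 is always allowed in this model.
 */
int deepest_allowed_state(const struct idle_state *states, size_t n,
			  int latency_limit_us)
{
	size_t pick = 0;

	assert(states != NULL && n > 0);

	for (size_t i = 1; i < n; i++) {
		if (latency_limit_us >= 0 &&
		    states[i].exit_latency_us > latency_limit_us)
			break;	/* every deeper state is slower still */
		pick = i;
	}
	return (int)pick;
}
```

Under these assumptions a limit of 0 selects index 0 (C1), while a
negative limit lets the model go all the way to the deepest entry.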

I still need to re-test (just to make sure I did not cause
any regressions) and then I'll send a v3.

Regards,

Hans



>>> @@ -33,6 +34,13 @@ static int get_sem(struct dw_i2c_dev *dev, u32
>>> *sem)
>>>      u32 data;
>>>      int ret;
>>>
>>> +    /*
>>> +     * Force CPU to C1 state, otherwise if the cpuidle /
>>> intel_idle
>>> +     * driver tries to change the C state while we're holding the
>>> +     * semaphore, the SoC hangs.
>>
>> C0?
>>
>>> +     */
>>> +    pm_qos_update_request(&dev->pm_qos, 0);
>>
>> C1 is when you set 1 here, right?
>
> I believe so, yes.
>
>>
>>> platform_device *pdev)
>>>      if (!dev->pm_runtime_disabled)
>>>          pm_runtime_disable(&pdev->dev);
>>
>>> +    if (dev->acquire_lock)
>>> +        pm_qos_remove_request(&dev->pm_qos);
>>> +
>>
>> Perhaps you need to do this in -core.c. Otherwise you missed PCI case.
>> (Even with PCI enumerated host with ACPI-enabled firmware you may get
>> _SEM object present)
>
> Currently only i2c-designware-plardrv.c calls i2c_dw_eval_lock_support()
> which does the pm_qos_add_request, so I put it here to keep things
> balanced.
>
> Regards,
>
> Hans
Len Brown Dec. 25, 2016, 6:31 p.m. UTC | #4
Is there a simple way to run a test to keep deep C-states
and instead disable part or all of i2c on this platform,
to see how much stability separating the two will buy us?

A lot of people are struggling w/ the stability of this platform,
and it would be great to make some progress on that.

thanks,
-Len



On Sat, Dec 10, 2016 at 2:59 PM, Hans de Goede <hdegoede@redhat.com> wrote:
> Hi,
>
>
> On 10-12-16 20:33, Hans de Goede wrote:
>>
>> Hi,
>>
>> On 10-12-16 15:53, Andy Shevchenko wrote:
>>>
>>> +Cc: Len
>>> Len, I think you would be interested by this.
>>>
>>> Hans, thanks for the change!
>>
>>
>> You're welcome I ended up comparing the code in
>> i2c-dw_i2c-Ported-punit-locking-patch-from-MCG-kerne.patch from:
>>
>>
>> https://github.com/01org/ProductionKernelQuilts/tree/master/uefi/cht-m1stable/patches
>>
>> against the mainline code while I was trying to fix the maddening
>> problem of the entire SoC hanging more or less as soon
>> as I tried to use the pmic i2c bus and there I found
>> some fiddling with pm_qos which let to this patch.
>>
>>> Most probably we will anticipate Len's ACK
>>> on this one.
>>>
>>> On Sat, 2016-12-10 at 15:19 +0100, Hans de Goede wrote:
>>>>
>>>> On my cherrytrail tablet with axp288 pmic, just doing a bunch of
>>>> repeated
>>>> reads from the pmic, e.g. "i2cdump -y 14 0x34" would lookup the tablet
>>>> in
>>>> 1 - 3 runs guaranteed.
>>>>
>>>> This seems to be causes by the cpuidle / intel_idle driver trying to
>>>> change the C-state while we hold the punit bus semaphore, at which
>>>> point
>>>> everything just hangs.
>>>>
>>>> Avoid this by forcing the CPU to C1 before acquiring the punit bus
>>>> semaphore.
>>>
>>>
>>> Isn't it C0? C1 as far as I remember is halted state.
>>
>>
>> You're right, I will fix it.
>
>
> Correction, upon closer reading of the docs, we cannot disallow
> the CPU to enter C1 / force it to either C0 or C1, what we can
> disallow is for it to enter C6/C7. Which also makes sense wrt
> this bug, since entering C6/C7 involves turning of the
> CPU-core power-plane, which requires the punit to access the pmic.
>
> So I've changes the text in both the commit msg and the comment
> to: "Disallow the CPU to enter C6 or C7"
>
> I still need to re-test (just to make sure I did not cause
> any regressions) and then I'll send a v3.
>
> Regards,
>
> Hans
>
>
>
>
>>>> @@ -33,6 +34,13 @@ static int get_sem(struct dw_i2c_dev *dev, u32
>>>> *sem)
>>>>      u32 data;
>>>>      int ret;
>>>>
>>>> +    /*
>>>> +     * Force CPU to C1 state, otherwise if the cpuidle /
>>>> intel_idle
>>>> +     * driver tries to change the C state while we're holding the
>>>> +     * semaphore, the SoC hangs.
>>>
>>>
>>> C0?
>>>
>>>> +     */
>>>> +    pm_qos_update_request(&dev->pm_qos, 0);
>>>
>>>
>>> C1 is when you set 1 here, right?
>>
>>
>> I believe so, yes.
>>
>>>
>>>> platform_device *pdev)
>>>>      if (!dev->pm_runtime_disabled)
>>>>          pm_runtime_disable(&pdev->dev);
>>>
>>>
>>>> +    if (dev->acquire_lock)
>>>> +        pm_qos_remove_request(&dev->pm_qos);
>>>> +
>>>
>>>
>>> Perhaps you need to do this in -core.c. Otherwise you missed PCI case.
>>> (Even with PCI enumerated host with ACPI-enabled firmware you may get
>>> _SEM object present)
>>
>>
>> Currently only i2c-designware-plardrv.c calls i2c_dw_eval_lock_support()
>> which does the pm_qos_add_request, so I put it here to keep things
>> balanced.
>>
>> Regards,
>>
>> Hans

Patch

diff --git a/drivers/i2c/busses/i2c-designware-baytrail.c b/drivers/i2c/busses/i2c-designware-baytrail.c
index a419777..4f10ebb 100644
--- a/drivers/i2c/busses/i2c-designware-baytrail.c
+++ b/drivers/i2c/busses/i2c-designware-baytrail.c
@@ -16,6 +16,7 @@ 
 #include <linux/acpi.h>
 #include <linux/i2c.h>
 #include <linux/interrupt.h>
+#include <linux/pm_qos.h>
 
 #include <asm/iosf_mbi.h>
 
@@ -33,6 +34,13 @@  static int get_sem(struct dw_i2c_dev *dev, u32 *sem)
 	u32 data;
 	int ret;
 
+	/*
+	 * Force CPU to C1 state, otherwise if the cpuidle / intel_idle
+	 * driver tries to change the C state while we're holding the
+	 * semaphore, the SoC hangs.
+	 */
+	pm_qos_update_request(&dev->pm_qos, 0);
+
 	ret = iosf_mbi_read(BT_MBI_UNIT_PMC, MBI_REG_READ, PUNIT_SEMAPHORE, &data);
 	if (ret) {
 		dev_err(dev->dev, "iosf failed to read punit semaphore\n");
@@ -56,6 +64,8 @@  static void reset_semaphore(struct dw_i2c_dev *dev)
 	data &= ~PUNIT_SEMAPHORE_BIT;
 	if (iosf_mbi_write(BT_MBI_UNIT_PMC, MBI_REG_WRITE, PUNIT_SEMAPHORE, data))
 		dev_err(dev->dev, "iosf failed to reset punit semaphore during write\n");
+
+	pm_qos_update_request(&dev->pm_qos, PM_QOS_DEFAULT_VALUE);
 }
 
 static int baytrail_i2c_acquire(struct dw_i2c_dev *dev)
@@ -143,6 +153,8 @@  int i2c_dw_eval_lock_support(struct dw_i2c_dev *dev)
 			return -EPROBE_DEFER;
 
 		dev_info(dev->dev, "I2C bus managed by PUNIT\n");
+		pm_qos_add_request(&dev->pm_qos, PM_QOS_CPU_DMA_LATENCY,
+				   PM_QOS_DEFAULT_VALUE);
 		dev->acquire_lock = baytrail_i2c_acquire;
 		dev->release_lock = baytrail_i2c_release;
 		dev->pm_runtime_disabled = true;
diff --git a/drivers/i2c/busses/i2c-designware-core.h b/drivers/i2c/busses/i2c-designware-core.h
index fb143f5..47d284c 100644
--- a/drivers/i2c/busses/i2c-designware-core.h
+++ b/drivers/i2c/busses/i2c-designware-core.h
@@ -22,6 +22,7 @@ 
  *
  */
 
+#include <linux/pm_qos.h>
 
 #define DW_IC_CON_MASTER		0x1
 #define DW_IC_CON_SPEED_STD		0x2
@@ -67,6 +68,7 @@ 
  * @fp_lcnt: fast plus LCNT value
  * @hs_hcnt: high speed HCNT value
  * @hs_lcnt: high speed LCNT value
+ * @pm_qos: pm_qos_request used while holding a hardware lock on the bus
  * @acquire_lock: function to acquire a hardware lock on the bus
  * @release_lock: function to release a hardware lock on the bus
  * @pm_runtime_disabled: true if pm runtime is disabled
@@ -114,6 +116,7 @@  struct dw_i2c_dev {
 	u16			fp_lcnt;
 	u16			hs_hcnt;
 	u16			hs_lcnt;
+	struct pm_qos_request	pm_qos;
 	int			(*acquire_lock)(struct dw_i2c_dev *dev);
 	void			(*release_lock)(struct dw_i2c_dev *dev);
 	bool			pm_runtime_disabled;
diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c
index 97a2ca1..6d72929 100644
--- a/drivers/i2c/busses/i2c-designware-platdrv.c
+++ b/drivers/i2c/busses/i2c-designware-platdrv.c
@@ -291,6 +291,9 @@  static int dw_i2c_plat_remove(struct platform_device *pdev)
 	if (!dev->pm_runtime_disabled)
 		pm_runtime_disable(&pdev->dev);
 
+	if (dev->acquire_lock)
+		pm_qos_remove_request(&dev->pm_qos);
+
 	return 0;
 }