diff mbox series

i2c: i801: Fix interrupt storm from SMB_ALERT signal

Message ID 20211019141700.764413-1-jarkko.nikula@linux.intel.com
State Superseded
Headers show
Series i2c: i801: Fix interrupt storm from SMB_ALERT signal | expand

Commit Message

Jarkko Nikula Oct. 19, 2021, 2:17 p.m. UTC
Currently interrupt storm will occur from i2-i801 after first transaction
if SMB_ALERT signal is enabled and ever asserted, even before the driver
is loaded and does not recover because that interrupt is not
acknowledged.

This fix aims to fix it by two ways:
- Add acknowledging for the SMB_ALERT interrupt status
- Disable the SMB_ALERT interrupt on platforms where possible since the
  driver currently does not make use for it

Acknowledging resets the SMB_ALERT interrupt status on all platforms and
also should help to avoid interrupt storm on older platforms where the
SMB_ALERT interrupt disabling is not available.

For simplicity this fix reuses the host notify feature for disabling and
restoring original register value.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=177311
Reported-by: ck+kernelbugzilla@bl4ckb0x.de
Reported-by: stephane.poignant@protonmail.com
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
---
Hi Conrad and Stephane. This patch is otherwise the same than the one I
had in bugzilla but this adds also acknowledging for the SMB_ALERT
interrupt. There is short time window during driver load and unload
where interrupt storm will still occur if signal was asserted. Also
interrupt disabling is possible only on ICH3 and later so interrupt
acknowledging should also help those old platforms.
---
 drivers/i2c/busses/i2c-i801.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

Comments

Jean Delvare Oct. 26, 2021, 9:28 p.m. UTC | #1
Hi Jarkko,

On Tue, 19 Oct 2021 17:17:00 +0300, Jarkko Nikula wrote:
> Currently interrupt storm will occur from i2-i801 after first transaction

Typo: i2c-i801.

> if SMB_ALERT signal is enabled and ever asserted, even before the driver
> is loaded and does not recover because that interrupt is not
> acknowledged.

Thank you very much for looking into this old bug and finally figuring
it out. Great job!

So basically the interrupt storm begins as soon as SMBHSTCNT_INTREN is
set, because that bit controls both regular SMBus transactions and
SMBALERT# (but not Host Notify which has its own interrupt enable
control bit).

> This fix aims to fix it by two ways:
> - Add acknowledging for the SMB_ALERT interrupt status
> - Disable the SMB_ALERT interrupt on platforms where possible since the
>   driver currently does not make use for it
> 
> Acknowledging resets the SMB_ALERT interrupt status on all platforms and
> also should help to avoid interrupt storm on older platforms where the
> SMB_ALERT interrupt disabling is not available.
> 
> For simplicity this fix reuses the host notify feature for disabling and
> restoring original register value.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=177311
> Reported-by: ck+kernelbugzilla@bl4ckb0x.de
> Reported-by: stephane.poignant@protonmail.com
> Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
> ---
> Hi Conrad and Stephane. This patch is otherwise the same than the one I
> had in bugzilla but this adds also acknowledging for the SMB_ALERT
> interrupt. There is short time window during driver load and unload
> where interrupt storm will still occur if signal was asserted. Also

The storm only starts with the first transaction, loading the driver
does not start it, so I can't see the window at load time.

For driver unload time, maybe we should restore SMBHSTCNT_INTREN to
its original value at that time, to prevent an interrupt storm from
happening. If we don't, then I suspect we'll have a storm not only for
a short time window but actually forever.

> interrupt disabling is possible only on ICH3 and later so interrupt
> acknowledging should also help those old platforms.

I'd be very surprised if systems with pre-ICH3 chipsets are still up
and running. Even the ICH3 is 20 year old by now. So I wouldn't bother
with them.

> ---
>  drivers/i2c/busses/i2c-i801.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
> index 115660dce722..e95de4ce6b64 100644
> --- a/drivers/i2c/busses/i2c-i801.c
> +++ b/drivers/i2c/busses/i2c-i801.c
> @@ -190,6 +190,7 @@
>  #define SMBSLVSTS_HST_NTFY_STS	BIT(0)
>  
>  /* Host Notify Command register bits */
> +#define SMBSLVCMD_SMBALERT_DISABLE	BIT(2)
>  #define SMBSLVCMD_HST_NTFY_INTREN	BIT(0)
>  
>  #define STATUS_ERROR_FLAGS	(SMBHSTSTS_FAILED | SMBHSTSTS_BUS_ERR | \
> @@ -642,9 +643,11 @@ static irqreturn_t i801_isr(int irq, void *dev_id)
>  	 * Clear irq sources and report transaction result.
>  	 * ->status must be cleared before the next transaction is started.
>  	 */
> -	status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS;
> -	if (status) {
> +	status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS | SMBHSTSTS_SMBALERT_STS;
> +	if (status)
>  		outb_p(status, SMBHSTSTS(priv));

Might be worth a comment.

> +	status &= ~SMBHSTSTS_SMBALERT_STS;
> +	if (status) {
>  		priv->status = status;
>  		complete(&priv->done);
>  	}
> @@ -972,9 +975,9 @@ static void i801_enable_host_notify(struct i2c_adapter *adapter)
>  	if (!(priv->features & FEATURE_HOST_NOTIFY))
>  		return;
>  
> -	if (!(SMBSLVCMD_HST_NTFY_INTREN & priv->original_slvcmd))
> -		outb_p(SMBSLVCMD_HST_NTFY_INTREN | priv->original_slvcmd,
> -		       SMBSLVCMD(priv));
> +	/* Enable host notify interrupt and disable SMB_ALERT signal */
> +	outb_p(SMBSLVCMD_HST_NTFY_INTREN | SMBSLVCMD_SMBALERT_DISABLE |
> +	       priv->original_slvcmd, SMBSLVCMD(priv));

A more verbose comment would be welcome too. Right now we know why we
are doing that, but in a few years we won't remember.

>  
>  	/* clear Host Notify bit to allow a new notification */
>  	outb_p(SMBSLVSTS_HST_NTFY_STS, SMBSLVSTS(priv));

Tested-by: Jean Delvare <jdelvare@suse.de>

(But my system was not affected by the issue in the first place, so
it's only a non-regression test.)

I'm also wondering whether it would make sense to actually support
SMBus Alert in the driver. We have core support for it already. If
there are systems out there which generate such events, then maybe
there's some value to be added. But we'd need to know which devices are
sending them. For the time being I agree that just fixing the interrupt
storms is more important.
Jarkko Nikula Oct. 27, 2021, 1:31 p.m. UTC | #2
On 10/27/21 12:28 AM, Jean Delvare wrote:
> Hi Jarkko,
> 
> On Tue, 19 Oct 2021 17:17:00 +0300, Jarkko Nikula wrote:
>> Currently interrupt storm will occur from i2-i801 after first transaction
> 
> Typo: i2c-i801.
> 
Will fix.

>> Hi Conrad and Stephane. This patch is otherwise the same than the one I
>> had in bugzilla but this adds also acknowledging for the SMB_ALERT
>> interrupt. There is short time window during driver load and unload
>> where interrupt storm will still occur if signal was asserted. Also
> 
> The storm only starts with the first transaction, loading the driver
> does not start it, so I can't see the window at load time.
> 
> For driver unload time, maybe we should restore SMBHSTCNT_INTREN to
> its original value at that time, to prevent an interrupt storm from
> happening. If we don't, then I suspect we'll have a storm not only for
> a short time window but actually forever.
> 
I had to hack and retest how I got that load and unload time interrupts 
but managed to do it with a code that doesn't acknowledge SMB_ALERT. It 
wasn't actually straight after boot but hacking with rmmod/modprobe, 
transactions and SMB_ALERT during runtime. Here in simplest form:

0. Boot
1. Do transaction and get the SMBHSTCNT_INTREN enabled
2. SMB_ALERT or rmmod
    - if SMB_ALERT first, then interrupts during this unload
    - if rmmod first, then interrupts during next unload
3. modprobe
    - SMBHSTCNT_INTREN already set
    -> PCI_STATUS_INTERRUPT set, "An interrupt is pending!" followed by 
interrupts between devm_request_irq and i801_enable_host_notify where 
SMB_ALERT disabled by the code change.

>> -	status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS;
>> -	if (status) {
>> +	status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS | SMBHSTSTS_SMBALERT_STS;
>> +	if (status)
>>   		outb_p(status, SMBHSTSTS(priv));
> 
> Might be worth a comment.
> 
...
>> -	if (!(SMBSLVCMD_HST_NTFY_INTREN & priv->original_slvcmd))
>> -		outb_p(SMBSLVCMD_HST_NTFY_INTREN | priv->original_slvcmd,
>> -		       SMBSLVCMD(priv));
>> +	/* Enable host notify interrupt and disable SMB_ALERT signal */
>> +	outb_p(SMBSLVCMD_HST_NTFY_INTREN | SMBSLVCMD_SMBALERT_DISABLE |
>> +	       priv->original_slvcmd, SMBSLVCMD(priv));
> 
> A more verbose comment would be welcome too. Right now we know why we
> are doing that, but in a few years we won't remember.
> 
Agree to both, will update.

>>   
>>   	/* clear Host Notify bit to allow a new notification */
>>   	outb_p(SMBSLVSTS_HST_NTFY_STS, SMBSLVSTS(priv));
> 
> Tested-by: Jean Delvare <jdelvare@suse.de>
> 
> (But my system was not affected by the issue in the first place, so
> it's only a non-regression test.)
> 
I was fortunate to find a reference design in our lab where SMB_ALERT 
wasn't connected anywhere else than a pull-up and was able to 
relatively safely pull it down by a grounded wire. Just have to be 
careful to not touch anything else with the wire :-)

Jarkko
diff mbox series

Patch

diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
index 115660dce722..e95de4ce6b64 100644
--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -190,6 +190,7 @@ 
 #define SMBSLVSTS_HST_NTFY_STS	BIT(0)
 
 /* Host Notify Command register bits */
+#define SMBSLVCMD_SMBALERT_DISABLE	BIT(2)
 #define SMBSLVCMD_HST_NTFY_INTREN	BIT(0)
 
 #define STATUS_ERROR_FLAGS	(SMBHSTSTS_FAILED | SMBHSTSTS_BUS_ERR | \
@@ -642,9 +643,11 @@  static irqreturn_t i801_isr(int irq, void *dev_id)
 	 * Clear irq sources and report transaction result.
 	 * ->status must be cleared before the next transaction is started.
 	 */
-	status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS;
-	if (status) {
+	status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS | SMBHSTSTS_SMBALERT_STS;
+	if (status)
 		outb_p(status, SMBHSTSTS(priv));
+	status &= ~SMBHSTSTS_SMBALERT_STS;
+	if (status) {
 		priv->status = status;
 		complete(&priv->done);
 	}
@@ -972,9 +975,9 @@  static void i801_enable_host_notify(struct i2c_adapter *adapter)
 	if (!(priv->features & FEATURE_HOST_NOTIFY))
 		return;
 
-	if (!(SMBSLVCMD_HST_NTFY_INTREN & priv->original_slvcmd))
-		outb_p(SMBSLVCMD_HST_NTFY_INTREN | priv->original_slvcmd,
-		       SMBSLVCMD(priv));
+	/* Enable host notify interrupt and disable SMB_ALERT signal */
+	outb_p(SMBSLVCMD_HST_NTFY_INTREN | SMBSLVCMD_SMBALERT_DISABLE |
+	       priv->original_slvcmd, SMBSLVCMD(priv));
 
 	/* clear Host Notify bit to allow a new notification */
 	outb_p(SMBSLVSTS_HST_NTFY_STS, SMBSLVSTS(priv));