Message ID | 20211019141700.764413-1-jarkko.nikula@linux.intel.com |
---|---|
State | Superseded |
Headers | show |
Series | i2c: i801: Fix interrupt storm from SMB_ALERT signal | expand |
Hi Jarkko, On Tue, 19 Oct 2021 17:17:00 +0300, Jarkko Nikula wrote: > Currently interrupt storm will occur from i2-i801 after first transaction Typo: i2c-i801. > if SMB_ALERT signal is enabled and ever asserted, even before the driver > is loaded and does not recover because that interrupt is not > acknowledged. Thank you very much for looking into this old bug and finally figuring it out. Great job! So basically the interrupt storm begins as soon as SMBHSTCNT_INTREN is set, because that bit controls both regular SMBus transactions and SMBALERT# (but not Host Notify which has its own interrupt enable control bit). > This fix aims to fix it by two ways: > - Add acknowledging for the SMB_ALERT interrupt status > - Disable the SMB_ALERT interrupt on platforms where possible since the > driver currently does not make use for it > > Acknowledging resets the SMB_ALERT interrupt status on all platforms and > also should help to avoid interrupt storm on older platforms where the > SMB_ALERT interrupt disabling is not available. > > For simplicity this fix reuses the host notify feature for disabling and > restoring original register value. > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=177311 > Reported-by: ck+kernelbugzilla@bl4ckb0x.de > Reported-by: stephane.poignant@protonmail.com > Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> > --- > Hi Conrad and Stephane. This patch is otherwise the same than the one I > had in bugzilla but this adds also acknowledging for the SMB_ALERT > interrupt. There is short time window during driver load and unload > where interrupt storm will still occur if signal was asserted. Also The storm only starts with the first transaction, loading the driver does not start it, so I can't see the window at load time. For driver unload time, maybe we should restore SMBHSTCNT_INTREN to its original value at that time, to prevent an interrupt storm from happening. If we don't, then I suspect we'll have a storm not only for a short time window but actually forever. > interrupt disabling is possible only on ICH3 and later so interrupt > acknowledging should also help those old platforms. I'd be very surprised if systems with pre-ICH3 chipsets are still up and running. Even the ICH3 is 20 year old by now. So I wouldn't bother with them. > --- > drivers/i2c/busses/i2c-i801.c | 13 ++++++++----- > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c > index 115660dce722..e95de4ce6b64 100644 > --- a/drivers/i2c/busses/i2c-i801.c > +++ b/drivers/i2c/busses/i2c-i801.c > @@ -190,6 +190,7 @@ > #define SMBSLVSTS_HST_NTFY_STS BIT(0) > > /* Host Notify Command register bits */ > +#define SMBSLVCMD_SMBALERT_DISABLE BIT(2) > #define SMBSLVCMD_HST_NTFY_INTREN BIT(0) > > #define STATUS_ERROR_FLAGS (SMBHSTSTS_FAILED | SMBHSTSTS_BUS_ERR | \ > @@ -642,9 +643,11 @@ static irqreturn_t i801_isr(int irq, void *dev_id) > * Clear irq sources and report transaction result. > * ->status must be cleared before the next transaction is started. > */ > - status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS; > - if (status) { > + status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS | SMBHSTSTS_SMBALERT_STS; > + if (status) > outb_p(status, SMBHSTSTS(priv)); Might be worth a comment. > + status &= ~SMBHSTSTS_SMBALERT_STS; > + if (status) { > priv->status = status; > complete(&priv->done); > } > @@ -972,9 +975,9 @@ static void i801_enable_host_notify(struct i2c_adapter *adapter) > if (!(priv->features & FEATURE_HOST_NOTIFY)) > return; > > - if (!(SMBSLVCMD_HST_NTFY_INTREN & priv->original_slvcmd)) > - outb_p(SMBSLVCMD_HST_NTFY_INTREN | priv->original_slvcmd, > - SMBSLVCMD(priv)); > + /* Enable host notify interrupt and disable SMB_ALERT signal */ > + outb_p(SMBSLVCMD_HST_NTFY_INTREN | SMBSLVCMD_SMBALERT_DISABLE | > + priv->original_slvcmd, SMBSLVCMD(priv)); A more verbose comment would be welcome too. Right now we know why we are doing that, but in a few years we won't remember. > > /* clear Host Notify bit to allow a new notification */ > outb_p(SMBSLVSTS_HST_NTFY_STS, SMBSLVSTS(priv)); Tested-by: Jean Delvare <jdelvare@suse.de> (But my system was not affected by the issue in the first place, so it's only a non-regression test.) I'm also wondering whether it would make sense to actually support SMBus Alert in the driver. We have core support for it already. If there are systems out there which generate such events, then maybe there's some value to be added. But we'd need to know which devices are sending them. For the time being I agree that just fixing the interrupt storms is more important.
On 10/27/21 12:28 AM, Jean Delvare wrote: > Hi Jarkko, > > On Tue, 19 Oct 2021 17:17:00 +0300, Jarkko Nikula wrote: >> Currently interrupt storm will occur from i2-i801 after first transaction > > Typo: i2c-i801. > Will fix. >> Hi Conrad and Stephane. This patch is otherwise the same than the one I >> had in bugzilla but this adds also acknowledging for the SMB_ALERT >> interrupt. There is short time window during driver load and unload >> where interrupt storm will still occur if signal was asserted. Also > > The storm only starts with the first transaction, loading the driver > does not start it, so I can't see the window at load time. > > For driver unload time, maybe we should restore SMBHSTCNT_INTREN to > its original value at that time, to prevent an interrupt storm from > happening. If we don't, then I suspect we'll have a storm not only for > a short time window but actually forever. > I had to hack and retest how I got that load and unload time interrupts but managed to do it with a code that doesn't acknowledge SMB_ALERT. It wasn't actually straight after boot but hacking with rmmod/modprobe, transactions and SMB_ALERT during runtime. Here in simplest form: 0. Boot 1. Do transaction and get the SMBHSTCNT_INTREN enabled 2. SMB_ALERT or rmmod - if SMB_ALERT first, then interrupts during this unload - if rmmod first, then interrupts during next unload 3. modprobe - SMBHSTCNT_INTREN already set -> PCI_STATUS_INTERRUPT set, "An interrupt is pending!" followed by interrupts between devm_request_irq and i801_enable_host_notify where SMB_ALERT disabled by the code change. >> - status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS; >> - if (status) { >> + status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS | SMBHSTSTS_SMBALERT_STS; >> + if (status) >> outb_p(status, SMBHSTSTS(priv)); > > Might be worth a comment. > ... >> - if (!(SMBSLVCMD_HST_NTFY_INTREN & priv->original_slvcmd)) >> - outb_p(SMBSLVCMD_HST_NTFY_INTREN | priv->original_slvcmd, >> - SMBSLVCMD(priv)); >> + /* Enable host notify interrupt and disable SMB_ALERT signal */ >> + outb_p(SMBSLVCMD_HST_NTFY_INTREN | SMBSLVCMD_SMBALERT_DISABLE | >> + priv->original_slvcmd, SMBSLVCMD(priv)); > > A more verbose comment would be welcome too. Right now we know why we > are doing that, but in a few years we won't remember. > Agree to both, will update. >> >> /* clear Host Notify bit to allow a new notification */ >> outb_p(SMBSLVSTS_HST_NTFY_STS, SMBSLVSTS(priv)); > > Tested-by: Jean Delvare <jdelvare@suse.de> > > (But my system was not affected by the issue in the first place, so > it's only a non-regression test.) > I was fortunate to find a reference design in our lab where SMB_ALERT wasn't connected anywhere else than a pull-up and was able to relatively safely pull it down by a grounded wire. Just have to be careful to not touch anything else with the wire :-) Jarkko
diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c index 115660dce722..e95de4ce6b64 100644 --- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -190,6 +190,7 @@ #define SMBSLVSTS_HST_NTFY_STS BIT(0) /* Host Notify Command register bits */ +#define SMBSLVCMD_SMBALERT_DISABLE BIT(2) #define SMBSLVCMD_HST_NTFY_INTREN BIT(0) #define STATUS_ERROR_FLAGS (SMBHSTSTS_FAILED | SMBHSTSTS_BUS_ERR | \ @@ -642,9 +643,11 @@ static irqreturn_t i801_isr(int irq, void *dev_id) * Clear irq sources and report transaction result. * ->status must be cleared before the next transaction is started. */ - status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS; - if (status) { + status &= SMBHSTSTS_INTR | STATUS_ERROR_FLAGS | SMBHSTSTS_SMBALERT_STS; + if (status) outb_p(status, SMBHSTSTS(priv)); + status &= ~SMBHSTSTS_SMBALERT_STS; + if (status) { priv->status = status; complete(&priv->done); } @@ -972,9 +975,9 @@ static void i801_enable_host_notify(struct i2c_adapter *adapter) if (!(priv->features & FEATURE_HOST_NOTIFY)) return; - if (!(SMBSLVCMD_HST_NTFY_INTREN & priv->original_slvcmd)) - outb_p(SMBSLVCMD_HST_NTFY_INTREN | priv->original_slvcmd, - SMBSLVCMD(priv)); + /* Enable host notify interrupt and disable SMB_ALERT signal */ + outb_p(SMBSLVCMD_HST_NTFY_INTREN | SMBSLVCMD_SMBALERT_DISABLE | + priv->original_slvcmd, SMBSLVCMD(priv)); /* clear Host Notify bit to allow a new notification */ outb_p(SMBSLVSTS_HST_NTFY_STS, SMBSLVSTS(priv));
Currently interrupt storm will occur from i2-i801 after first transaction if SMB_ALERT signal is enabled and ever asserted, even before the driver is loaded and does not recover because that interrupt is not acknowledged. This fix aims to fix it by two ways: - Add acknowledging for the SMB_ALERT interrupt status - Disable the SMB_ALERT interrupt on platforms where possible since the driver currently does not make use for it Acknowledging resets the SMB_ALERT interrupt status on all platforms and also should help to avoid interrupt storm on older platforms where the SMB_ALERT interrupt disabling is not available. For simplicity this fix reuses the host notify feature for disabling and restoring original register value. Link: https://bugzilla.kernel.org/show_bug.cgi?id=177311 Reported-by: ck+kernelbugzilla@bl4ckb0x.de Reported-by: stephane.poignant@protonmail.com Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> --- Hi Conrad and Stephane. This patch is otherwise the same than the one I had in bugzilla but this adds also acknowledging for the SMB_ALERT interrupt. There is short time window during driver load and unload where interrupt storm will still occur if signal was asserted. Also interrupt disabling is possible only on ICH3 and later so interrupt acknowledging should also help those old platforms. --- drivers/i2c/busses/i2c-i801.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)