i2c: i801: Fix SMBus ENXIO on resume from suspend.

Message ID 20171115170156.21532-1-alban.browaeys@gmail.com
State Under Review
Headers show
Series
  • i2c: i801: Fix SMBus ENXIO on resume from suspend.
Related show

Commit Message

Alban Browaeys Nov. 15, 2017, 5:01 p.m.
Disabling I2C_EN on resume as it was restored in
suspend when we set pci config to its original value.

ENXIO landing in rmi_smb_get_version,  i2c-i802 debug
pointing at i801_check_post and guesswork from reading
probe which has this tweak.

i801_smbus 0000:00:1f.3: No response
rmi4_smbus 0-002c: failed to get SMBus version number!
i801_smbus 0000:00:1f.3: No response
rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed to read current IRQ mask.
i801_smbus 0000:00:1f.3: No response
rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -6.
rmi4_f01 rmi4-00.fn01: Resume failed with code -6.
rmi4_physical rmi4-00: Failed to suspend functions: -6
rmi4_smbus 0-002c: Failed to resume device: -6
rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to F03 TX register (-6).
i801_smbus 0000:00:1f.3: No response
rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to F03 TX register (-6).

Signed-off-by: Alban Browaeys <alban.browaeys@gmail.com>
---
 drivers/i2c/busses/i2c-i801.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Alban Browaeys Nov. 15, 2017, 8:25 p.m. | #1
My apologies.

This fix is a partial one: it only fixes the issue once
we unload then reload i2c-i801 .
This as when we load a second time  orig_hstcfg is set to the value the
first module load leftover.
Thus I will investigate the initial orig_hstcfg at boot to further the
fix.
I bet this only a few missing bits like HST_EN.

Best regards
Alban


Le mercredi 15 novembre 2017 à 18:01 +0100, Alban Browaeys a écrit :
> Disabling I2C_EN on resume as it was restored in
> suspend when we set pci config to its original value.
> 
> ENXIO landing in rmi_smb_get_version,  i2c-i802 debug
> pointing at i801_check_post and guesswork from reading
> probe which has this tweak.
> 
> i801_smbus 0000:00:1f.3: No response
> rmi4_smbus 0-002c: failed to get SMBus version number!
> i801_smbus 0000:00:1f.3: No response
> rmi4_physical rmi4-00: rmi_driver_reset_handler: Failed to read
> current IRQ mask.
> i801_smbus 0000:00:1f.3: No response
> rmi4_f01 rmi4-00.fn01: Failed to restore normal operation: -6.
> rmi4_f01 rmi4-00.fn01: Resume failed with code -6.
> rmi4_physical rmi4-00: Failed to suspend functions: -6
> rmi4_smbus 0-002c: Failed to resume device: -6
> rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to F03 TX
> register (-6).
> i801_smbus 0000:00:1f.3: No response
> rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to F03 TX
> register (-6).
> 
> Signed-off-by: Alban Browaeys <alban.browaeys@gmail.com>
> ---
>  drivers/i2c/busses/i2c-i801.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-
> i801.c
> index 9e12a53ef7b8..b7b6c8ee2176 100644
> --- a/drivers/i2c/busses/i2c-i801.c
> +++ b/drivers/i2c/busses/i2c-i801.c
> @@ -1707,9 +1707,14 @@ static int i801_suspend(struct device *dev)
>  
>  static int i801_resume(struct device *dev)
>  {
> +	unsigned char temp;
>  	struct pci_dev *pci_dev = to_pci_dev(dev);
>  	struct i801_priv *priv = pci_get_drvdata(pci_dev);
>  
> +	pci_read_config_byte(pci_dev, SMBHSTCFG, &temp);
> +	temp &= ~SMBHSTCFG_I2C_EN;	/* SMBus timing */
> +	pci_write_config_byte(pci_dev, SMBHSTCFG, temp);
> +
>  	i801_enable_host_notify(&priv->adapter);
>  
>  	return 0;
Jean Delvare Nov. 20, 2017, 9:15 a.m. | #2
On Wed, 15 Nov 2017 21:25:25 +0100, Alban Browaeys wrote:
> This fix is a partial one: it only fixes the issue once
> we unload then reload i2c-i801 .
> This as when we load a second time  orig_hstcfg is set to the value the
> first module load leftover.
> Thus I will investigate the initial orig_hstcfg at boot to further the
> fix.
> I bet this only a few missing bits like HST_EN.

Volker, I think you had a fix for that one? But I can't find it in the
list archive :-(
Alban Browaeys Nov. 20, 2017, 2:18 p.m. | #3
I am on one of those Intel Haswell with the ACPI OpRegion conflict, the
acpi handler was added to cope with.

Could it be the embedded controller code mess with the smbus ?
When I print the OpRegion fields via acpi-call I get different values
after suspend/resume from S3 than at runtime at various points.

I already checked that the acpi SBUS methods are not called.
To do so I define a global name set to zero and set it to 1 in
SBUS.STRT (called by all SBUS method to verify the ready state).

Could it be the EC access the region directly ?
The acpi handler from i2c-i801 does not trigger. Is it in effect when
suspending or resuming ?

The reproducer seems hard to get right, I am still investigating.
It involves resume from suspend (S3) and a yet to define state.
Most of the time after an acpi reset (magic sysrq 'b').
And / or early load of i2x-i801, rmi_smbus and psmouse.

To call the acpi methods I echo to /proc/acpi/call from the acpi-call
module. 

I tried https://github.com/pali/i2c-acpi-sbus .
It fails early as acpi sbus methods check the inuse host status flag.
It is always on thus they bail out. i2c-i801 does not test inuse
(0x40).


Best regards
Alban


Le lundi 20 novembre 2017 à 10:15 +0100, Jean Delvare a écrit :
> On Wed, 15 Nov 2017 21:25:25 +0100, Alban Browaeys wrote:
> > This fix is a partial one: it only fixes the issue once
> > we unload then reload i2c-i801 .
> > This as when we load a second time  orig_hstcfg is set to the value
> > the
> > first module load leftover.
> > Thus I will investigate the initial orig_hstcfg at boot to further
> > the
> > fix.
> > I bet this only a few missing bits like HST_EN.
> 
> Volker, I think you had a fix for that one? But I can't find it in
> the
> list archive :-(
>
Jean Delvare Nov. 21, 2017, 11:49 a.m. | #4
On Tue, 21 Nov 2017 09:47:41 +0100, Volker Rümelin wrote:
> > Volker, I think you had a fix for that one? But I can't find it in the
> > list archive :-(
> Yes, on 13 Jun 2017 I sent a patch to the linux-i2c mailing list. But
> the mailing list server told me: Your address is not liked source for
> email. At this point I gave up.

I see, that's why it's in my mailbox but missing from the archive.
 
> If you are still interested I can try to resend the patch with my
> google email account.

I'll review it, so it will become visible :-)

Thanks,
Wolfram Sang Dec. 7, 2017, 10:57 a.m. | #5
> > > Volker, I think you had a fix for that one? But I can't find it in the
> > > list archive :-(
> > Yes, on 13 Jun 2017 I sent a patch to the linux-i2c mailing list. But
> > the mailing list server told me: Your address is not liked source for
> > email. At this point I gave up.
> 
> I see, that's why it's in my mailbox but missing from the archive.
>  
> > If you are still interested I can try to resend the patch with my
> > google email account.
> 
> I'll review it, so it will become visible :-)

So, Volker's patch showed up on the ML meanwhile, nice :)

I don't get if we need this patch as well or if this gets covered with
Volker's patch as well?
Alban Browaeys Dec. 7, 2017, 9:11 p.m. | #6
Thanks for Volker's patch.
I applied it and it does not help on this Intel Haswell box (Lenovo
Thinkpad Yoga S1). My initial patch did not helped either.

I was confused as the bug reproduce easely only if the i2c-i801,
rmi_smbus and psmouse are loaded once then resume from suspend.
Afterwards (even unload module) I sometimes get around the bug.

Could ACPI/EC OpRegion accesses happen without the kernel handler
triggering (while suspending and / or resuming) ?

Thanks
Alban



Le lundi 20 novembre 2017 à 15:18 +0100, Alban Browaeys a écrit :
> I am on one of those Intel Haswell with the ACPI OpRegion conflict,
> the
> acpi handler was added to cope with.
> 
> Could it be the embedded controller code mess with the smbus ?
> When I print the OpRegion fields via acpi-call I get different values
> after suspend/resume from S3 than at runtime at various points.
> 
> I already checked that the acpi SBUS methods are not called.
> To do so I define a global name set to zero and set it to 1 in
> SBUS.STRT (called by all SBUS method to verify the ready state).
> 
> Could it be the EC access the region directly ?
> The acpi handler from i2c-i801 does not trigger. Is it in effect when
> suspending or resuming ?
> 
> The reproducer seems hard to get right, I am still investigating.
> It involves resume from suspend (S3) and a yet to define state.
> Most of the time after an acpi reset (magic sysrq 'b').
> And / or early load of i2x-i801, rmi_smbus and psmouse.
> 
> To call the acpi methods I echo to /proc/acpi/call from the acpi-call
> module. 
> 
> I tried https://github.com/pali/i2c-acpi-sbus .
> It fails early as acpi sbus methods check the inuse host status flag.
> It is always on thus they bail out. i2c-i801 does not test inuse
> (0x40).
> 
> 
> Best regards
> Alban
> 
> 
> Le lundi 20 novembre 2017 à 10:15 +0100, Jean Delvare a écrit :
> > On Wed, 15 Nov 2017 21:25:25 +0100, Alban Browaeys wrote:
> > > This fix is a partial one: it only fixes the issue once
> > > we unload then reload i2c-i801 .
> > > This as when we load a second time  orig_hstcfg is set to the
> > > value
> > > the
> > > first module load leftover.
> > > Thus I will investigate the initial orig_hstcfg at boot to
> > > further
> > > the
> > > fix.
> > > I bet this only a few missing bits like HST_EN.
> > 
> > Volker, I think you had a fix for that one? But I can't find it in
> > the
> > list archive :-(
Alban Browaeys Dec. 11, 2017, 11:45 a.m. | #7
Le jeudi 07 décembre 2017 à 11:57 +0100, Wolfram Sang a écrit :
> > > > Volker, I think you had a fix for that one? But I can't find it
> > > > in the
> > > > list archive :-(
> > > 
> > > Yes, on 13 Jun 2017 I sent a patch to the linux-i2c mailing list.
> > > But
> > > the mailing list server told me: Your address is not liked source
> > > for
> > > email. At this point I gave up.
> > 
> > I see, that's why it's in my mailbox but missing from the archive.
> >  
> > > If you are still interested I can try to resend the patch with my
> > > google email account.
> > 
> > I'll review it, so it will become visible :-)
> 
> So, Volker's patch showed up on the ML meanwhile, nice :)
> 
> I don't get if we need this patch as well or if this gets covered
> with
> Volker's patch as well?

Mine is a rough duplicate of Volker.

But it turns out none of those take care of the ENXIO issue. I thought
otherwise at first but this bug is not easely reproducible except on
cold boot.
I am at loss as to how to fix my ENXIO issue.

Thanks,
Alban

Patch

diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
index 9e12a53ef7b8..b7b6c8ee2176 100644
--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -1707,9 +1707,14 @@  static int i801_suspend(struct device *dev)
 
 static int i801_resume(struct device *dev)
 {
+	unsigned char temp;
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 	struct i801_priv *priv = pci_get_drvdata(pci_dev);
 
+	pci_read_config_byte(pci_dev, SMBHSTCFG, &temp);
+	temp &= ~SMBHSTCFG_I2C_EN;	/* SMBus timing */
+	pci_write_config_byte(pci_dev, SMBHSTCFG, temp);
+
 	i801_enable_host_notify(&priv->adapter);
 
 	return 0;