i2c: mux: pca954x: use relaxed locking of the top i2c adapter during mux-locked muxing

Message ID 20180427084746.5105-1-bst@pengutronix.de
State Superseded
Delegated to: Peter Rosin
Headers show
Series
  • i2c: mux: pca954x: use relaxed locking of the top i2c adapter during mux-locked muxing
Related show

Commit Message

Bastian Stender April 27, 2018, 8:47 a.m.
With an i2c device behind a PCA9540 mux having CONFIG_I2C_DEBUG_BUS enabled
connection timeouts caused by a busy i2c bus can be observed:

  i2c i2c-3: master_xfer[0] W, addr=0x57, len=2
  i2c i2c-3: master_xfer[1] R, addr=0x57, len=128
  i2c i2c-2: <i2c_imx_xfer>
  i2c i2c-2: <i2c_imx_start>
  i2c i2c-2: <i2c_imx_bus_busy>
  i2c i2c-2: <i2c_imx_bus_busy> I2C bus is busy
  i2c i2c-2: <i2c_imx_xfer> exit with: error: -110

This happens due to the locking problem described in 6ef91fcca8a8
("i2c: mux: relax locking of the top i2c adapter during mux-locked muxing"):

The cause of the problem is that the mux is a i2c client on the same i2c bus
that it muxes. Transfers to the mux clients will lock the whole i2c bus prior
to attempting to switch the mux to the correct i2c segment.

The mentioned commit introduced a new locking concept as "mux-locked"
muxes so that these muxes lock only the muxes on the parent adapter
instead of the whole i2c bus when there is a transfer to the slave side of
the mux.

Make use of this new locking concept: use the introduced flag I2C_MUX_LOCKED
along with lock-aware i2c_transfer() instead of __i2c_transfer().

Signed-off-by: Bastian Stender <bst@pengutronix.de>
---
 drivers/i2c/muxes/i2c-mux-pca954x.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Peter Rosin April 27, 2018, 9:25 a.m. | #1
On 2018-04-27 10:47, Bastian Stender wrote:
> With an i2c device behind a PCA9540 mux having CONFIG_I2C_DEBUG_BUS enabled
> connection timeouts caused by a busy i2c bus can be observed:
> 
>   i2c i2c-3: master_xfer[0] W, addr=0x57, len=2
>   i2c i2c-3: master_xfer[1] R, addr=0x57, len=128
>   i2c i2c-2: <i2c_imx_xfer>
>   i2c i2c-2: <i2c_imx_start>
>   i2c i2c-2: <i2c_imx_bus_busy>
>   i2c i2c-2: <i2c_imx_bus_busy> I2C bus is busy
>   i2c i2c-2: <i2c_imx_xfer> exit with: error: -110
> 
> This happens due to the locking problem described in 6ef91fcca8a8
> ("i2c: mux: relax locking of the top i2c adapter during mux-locked muxing"):
> 
> The cause of the problem is that the mux is a i2c client on the same i2c bus
> that it muxes. Transfers to the mux clients will lock the whole i2c bus prior
> to attempting to switch the mux to the correct i2c segment.
> 
> The mentioned commit introduced a new locking concept as "mux-locked"
> muxes so that these muxes lock only the muxes on the parent adapter
> instead of the whole i2c bus when there is a transfer to the slave side of
> the mux.

This can't be the whole story, since the pca954x driver carefully uses
the unlocked __i2c_transfer. So, what device is it that sits behind the
mux that runs into this problem? And what do your i2c topology look
like?

I suspect your deadlocking device has some kind of interaction with some
other device on the some root i2c bus as the mux. I.e. you should not see
a deadlock because of the i2c interaction to update the mux itself, but
because of some other i2c interaction that isn't unlocked.

> Make use of this new locking concept: use the introduced flag I2C_MUX_LOCKED
> along with lock-aware i2c_transfer() instead of __i2c_transfer().


> Signed-off-by: Bastian Stender <bst@pengutronix.de>
> ---
>  drivers/i2c/muxes/i2c-mux-pca954x.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/i2c/muxes/i2c-mux-pca954x.c b/drivers/i2c/muxes/i2c-mux-pca954x.c
> index 09bafd3e68fa..0ea970eaa324 100644
> --- a/drivers/i2c/muxes/i2c-mux-pca954x.c
> +++ b/drivers/i2c/muxes/i2c-mux-pca954x.c
> @@ -230,7 +230,7 @@ static int pca954x_reg_write(struct i2c_adapter *adap,
>  		msg.len = 1;
>  		buf[0] = val;
>  		msg.buf = buf;
> -		ret = __i2c_transfer(adap, &msg, 1);
> +		ret = i2c_transfer(adap, &msg, 1);
>  

This is not complete, since you do not fix the unlocked SMBus access below,
which you must do if you switch to mux-locked.

However, *IF* this driver can be changed to be mux-locked (which it probably
can't, there are implications...) the whole of pca954x_reg_write should be
removed and call sites updated with regular i2c_smbus_write_byte calls (or
whatever is appropriate, I didn't check all that carefully).

I would love it if this was possible, and it is for the simple cases which
is somewhat annoying. But if you consider some of the more complex examples
in Documentation/i2c/i2c-topology you will see that it probably can't be
done without causing problems elsewhere.

Cheers,
Peter

>  		if (ret >= 0 && ret != 1)
>  			ret = -EREMOTEIO;
> @@ -380,8 +380,9 @@ static int pca954x_probe(struct i2c_client *client,
>  		return -ENODEV;
>  
>  	muxc = i2c_mux_alloc(adap, &client->dev,
> -			     PCA954X_MAX_NCHANS, sizeof(*data), 0,
> -			     pca954x_select_chan, pca954x_deselect_mux);
> +			     PCA954X_MAX_NCHANS, sizeof(*data),
> +			     I2C_MUX_LOCKED, pca954x_select_chan,
> +			     pca954x_deselect_mux);
>  	if (!muxc)
>  		return -ENOMEM;
>  	data = i2c_mux_priv(muxc);
>
Bastian Stender April 27, 2018, 10:38 a.m. | #2
Hi Peter,

On 04/27/2018 11:25 AM, Peter Rosin wrote:
> On 2018-04-27 10:47, Bastian Stender wrote:
>> With an i2c device behind a PCA9540 mux having CONFIG_I2C_DEBUG_BUS enabled
>> connection timeouts caused by a busy i2c bus can be observed:
>>
>>    i2c i2c-3: master_xfer[0] W, addr=0x57, len=2
>>    i2c i2c-3: master_xfer[1] R, addr=0x57, len=128
>>    i2c i2c-2: <i2c_imx_xfer>
>>    i2c i2c-2: <i2c_imx_start>
>>    i2c i2c-2: <i2c_imx_bus_busy>
>>    i2c i2c-2: <i2c_imx_bus_busy> I2C bus is busy
>>    i2c i2c-2: <i2c_imx_xfer> exit with: error: -110
>>
>> This happens due to the locking problem described in 6ef91fcca8a8
>> ("i2c: mux: relax locking of the top i2c adapter during mux-locked muxing"):
>>
>> The cause of the problem is that the mux is a i2c client on the same i2c bus
>> that it muxes. Transfers to the mux clients will lock the whole i2c bus prior
>> to attempting to switch the mux to the correct i2c segment.
>>
>> The mentioned commit introduced a new locking concept as "mux-locked"
>> muxes so that these muxes lock only the muxes on the parent adapter
>> instead of the whole i2c bus when there is a transfer to the slave side of
>> the mux.
> 
> This can't be the whole story, since the pca954x driver carefully uses
> the unlocked __i2c_transfer. So, what device is it that sits behind the
> mux that runs into this problem? And what do your i2c topology look
> like?

There are multiple EEPROMs (AT24C32D) and gpio expanders (PCF8575)
behind the mux:

                              EEPROM  EEPROM  EEPROM   EEPROM
                                |        |       |       |
          RTC     TMP          -+--------+-------+-------+--
           |       |          /
I2C  --+--+---+---+----+-- MUX  GPIO_EXPANDER     EEPROM
        |      |        |     \         |             |
       PMIC  EEPROM    ADC     --+------+--+-------+--+---
                                 |         |       |
                               EEPROM    EEPROM  GPIO_EXPANDER

> I suspect your deadlocking device has some kind of interaction with some
> other device on the some root i2c bus as the mux. I.e. you should not see
> a deadlock because of the i2c interaction to update the mux itself, but
> because of some other i2c interaction that isn't unlocked.

I am not aware of any interaction between the devices. Without the patch
I see the EEPROMs getting probed correctly. Within the probe function
i2c reads are performed and work. After that I can interact with the
devices on the bus, but not with the devices behind the mux.

Sometimes I see:

$ hexdump /sys/bus/i2c/devices/3-0057/eeprom
hexdump: /sys/bus/i2c/devices/3-0057/eeprom: Connection timed out

Sometimes the whole system just hangs, I do not see any cause/log messages.

The patch prevents all that. Any idea what could cause this?

>> Make use of this new locking concept: use the introduced flag I2C_MUX_LOCKED
>> along with lock-aware i2c_transfer() instead of __i2c_transfer().
> 
> 
>> Signed-off-by: Bastian Stender <bst@pengutronix.de>
>> ---
>>   drivers/i2c/muxes/i2c-mux-pca954x.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/i2c/muxes/i2c-mux-pca954x.c b/drivers/i2c/muxes/i2c-mux-pca954x.c
>> index 09bafd3e68fa..0ea970eaa324 100644
>> --- a/drivers/i2c/muxes/i2c-mux-pca954x.c
>> +++ b/drivers/i2c/muxes/i2c-mux-pca954x.c
>> @@ -230,7 +230,7 @@ static int pca954x_reg_write(struct i2c_adapter *adap,
>>   		msg.len = 1;
>>   		buf[0] = val;
>>   		msg.buf = buf;
>> -		ret = __i2c_transfer(adap, &msg, 1);
>> +		ret = i2c_transfer(adap, &msg, 1);
>>   
> 
> This is not complete, since you do not fix the unlocked SMBus access below,
> which you must do if you switch to mux-locked.

Oh, right. I am always getting into the if path, so the missing else
path did not strike me.

> However, *IF* this driver can be changed to be mux-locked (which it probably
> can't, there are implications...) the whole of pca954x_reg_write should be
> removed and call sites updated with regular i2c_smbus_write_byte calls (or
> whatever is appropriate, I didn't check all that carefully).
> 
> I would love it if this was possible, and it is for the simple cases which
> is somewhat annoying. But if you consider some of the more complex examples
> in Documentation/i2c/i2c-topology you will see that it probably can't be
> done without causing problems elsewhere.

Okay, I see. I will do my best to make it work without mux-locked then.

Regards,
Bastian
Peter Rosin April 27, 2018, 11:43 a.m. | #3
On 2018-04-27 12:38, Bastian Stender wrote:
> Hi Peter,
> 
> On 04/27/2018 11:25 AM, Peter Rosin wrote:
>> On 2018-04-27 10:47, Bastian Stender wrote:
>>> With an i2c device behind a PCA9540 mux having CONFIG_I2C_DEBUG_BUS enabled
>>> connection timeouts caused by a busy i2c bus can be observed:
>>>
>>>    i2c i2c-3: master_xfer[0] W, addr=0x57, len=2
>>>    i2c i2c-3: master_xfer[1] R, addr=0x57, len=128
>>>    i2c i2c-2: <i2c_imx_xfer>
>>>    i2c i2c-2: <i2c_imx_start>
>>>    i2c i2c-2: <i2c_imx_bus_busy>
>>>    i2c i2c-2: <i2c_imx_bus_busy> I2C bus is busy
>>>    i2c i2c-2: <i2c_imx_xfer> exit with: error: -110
>>>
>>> This happens due to the locking problem described in 6ef91fcca8a8
>>> ("i2c: mux: relax locking of the top i2c adapter during mux-locked muxing"):
>>>
>>> The cause of the problem is that the mux is a i2c client on the same i2c bus
>>> that it muxes. Transfers to the mux clients will lock the whole i2c bus prior
>>> to attempting to switch the mux to the correct i2c segment.
>>>
>>> The mentioned commit introduced a new locking concept as "mux-locked"
>>> muxes so that these muxes lock only the muxes on the parent adapter
>>> instead of the whole i2c bus when there is a transfer to the slave side of
>>> the mux.
>>
>> This can't be the whole story, since the pca954x driver carefully uses
>> the unlocked __i2c_transfer. So, what device is it that sits behind the
>> mux that runs into this problem? And what do your i2c topology look
>> like?
> 
> There are multiple EEPROMs (AT24C32D) and gpio expanders (PCF8575)
> behind the mux:
> 
>                               EEPROM  EEPROM  EEPROM   EEPROM
>                                 |        |       |       |
>          RTC     TMP           -+--------+-------+-------+--
>           |       |           /
> I2C  --+--+---+---+----+-- MUX  GPIO_EXPANDER     EEPROM
>        |      |        |      \         |             |
>       PMIC  EEPROM    ADC      --+------+--+-------+--+---
>                                  |         |       |
>                                EEPROM    EEPROM  GPIO_EXPANDER
> 
>> I suspect your deadlocking device has some kind of interaction with some
>> other device on the some root i2c bus as the mux. I.e. you should not see
>> a deadlock because of the i2c interaction to update the mux itself, but
>> because of some other i2c interaction that isn't unlocked.
> 
> I am not aware of any interaction between the devices. Without the patch
> I see the EEPROMs getting probed correctly. Within the probe function
> i2c reads are performed and work. After that I can interact with the
> devices on the bus, but not with the devices behind the mux.
> 
> Sometimes I see:
> 
> $ hexdump /sys/bus/i2c/devices/3-0057/eeprom
> hexdump: /sys/bus/i2c/devices/3-0057/eeprom: Connection timed out
> 
> Sometimes the whole system just hangs, I do not see any cause/log messages.
> 
> The patch prevents all that. Any idea what could cause this?

What are the GPIO expanders used for? Is the PMIC perhaps involved with
some device on the other side of the mux and that's what's triggering
the deadlock?

Have you tried running with lockdep? I don't think it will tell you
all that much since the i2c locks are rt_mutexes and are not tracked
IIRC, but maybe?

*time passes*

Looking at the driver for at24c32 I see that the WP pin may be
controlled by a gpio driver. Is the eeprom WP-pins perhaps connected
to the gpio-expanders? Could that lead to a dead-lock somewhere?

These are the kinds of cross-device interactions on the i2c bus that
you should be looking for...

>>> Make use of this new locking concept: use the introduced flag I2C_MUX_LOCKED
>>> along with lock-aware i2c_transfer() instead of __i2c_transfer().
>>
>>
>>> Signed-off-by: Bastian Stender <bst@pengutronix.de>
>>> ---
>>>   drivers/i2c/muxes/i2c-mux-pca954x.c | 7 ++++---
>>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/i2c/muxes/i2c-mux-pca954x.c b/drivers/i2c/muxes/i2c-mux-pca954x.c
>>> index 09bafd3e68fa..0ea970eaa324 100644
>>> --- a/drivers/i2c/muxes/i2c-mux-pca954x.c
>>> +++ b/drivers/i2c/muxes/i2c-mux-pca954x.c
>>> @@ -230,7 +230,7 @@ static int pca954x_reg_write(struct i2c_adapter *adap,
>>>   		msg.len = 1;
>>>   		buf[0] = val;
>>>   		msg.buf = buf;
>>> -		ret = __i2c_transfer(adap, &msg, 1);
>>> +		ret = i2c_transfer(adap, &msg, 1);
>>>   
>>
>> This is not complete, since you do not fix the unlocked SMBus access below,
>> which you must do if you switch to mux-locked.
> 
> Oh, right. I am always getting into the if path, so the missing else
> path did not strike me.
> 
>> However, *IF* this driver can be changed to be mux-locked (which it probably
>> can't, there are implications...) the whole of pca954x_reg_write should be
>> removed and call sites updated with regular i2c_smbus_write_byte calls (or
>> whatever is appropriate, I didn't check all that carefully).
>>
>> I would love it if this was possible, and it is for the simple cases which
>> is somewhat annoying. But if you consider some of the more complex examples
>> in Documentation/i2c/i2c-topology you will see that it probably can't be
>> done without causing problems elsewhere.
> 
> Okay, I see. I will do my best to make it work without mux-locked then.

If you can't, mux-locked could perhaps be made optional with a DT property or
something. See the binding for i2c-mux-gpmux for precedent.

Cheers,
Peter
Bastian Stender May 2, 2018, 6:52 a.m. | #4
Hi Peter,

On 04/27/2018 01:43 PM, Peter Rosin wrote:
> On 2018-04-27 12:38, Bastian Stender wrote:
>> Hi Peter,
>> 
>> On 04/27/2018 11:25 AM, Peter Rosin wrote:
>>> On 2018-04-27 10:47, Bastian Stender wrote:
>>>> With an i2c device behind a PCA9540 mux having
>>>> CONFIG_I2C_DEBUG_BUS enabled connection timeouts caused by a
>>>> busy i2c bus can be observed:
>>>> 
>>>> i2c i2c-3: master_xfer[0] W, addr=0x57, len=2 i2c i2c-3:
>>>> master_xfer[1] R, addr=0x57, len=128 i2c i2c-2: <i2c_imx_xfer> 
>>>> i2c i2c-2: <i2c_imx_start> i2c i2c-2: <i2c_imx_bus_busy> i2c
>>>> i2c-2: <i2c_imx_bus_busy> I2C bus is busy i2c i2c-2:
>>>> <i2c_imx_xfer> exit with: error: -110
>>>> 
>>>> This happens due to the locking problem described in
>>>> 6ef91fcca8a8 ("i2c: mux: relax locking of the top i2c adapter
>>>> during mux-locked muxing"):
>>>> 
>>>> The cause of the problem is that the mux is a i2c client on the
>>>> same i2c bus that it muxes. Transfers to the mux clients will
>>>> lock the whole i2c bus prior to attempting to switch the mux to
>>>> the correct i2c segment.
>>>> 
>>>> The mentioned commit introduced a new locking concept as
>>>> "mux-locked" muxes so that these muxes lock only the muxes on
>>>> the parent adapter instead of the whole i2c bus when there is a
>>>> transfer to the slave side of the mux.
>>> 
>>> This can't be the whole story, since the pca954x driver carefully
>>> uses the unlocked __i2c_transfer. So, what device is it that sits
>>> behind the mux that runs into this problem? And what do your i2c
>>> topology look like?
>> 
>> There are multiple EEPROMs (AT24C32D) and gpio expanders (PCF8575) 
>> behind the mux:
>> 
>> EEPROM  EEPROM  EEPROM   EEPROM |        |       |       | RTC
>> TMP           -+--------+-------+-------+-- |       |           / 
>> I2C  --+--+---+---+----+-- MUX  GPIO_EXPANDER     EEPROM |      |
>> |      \         |             | PMIC  EEPROM    ADC
>> --+------+--+-------+--+--- |         |       | EEPROM    EEPROM
>> GPIO_EXPANDER
>> 
>>> I suspect your deadlocking device has some kind of interaction
>>> with some other device on the some root i2c bus as the mux. I.e.
>>> you should not see a deadlock because of the i2c interaction to
>>> update the mux itself, but because of some other i2c interaction
>>> that isn't unlocked.
>> 
>> I am not aware of any interaction between the devices. Without the
>> patch I see the EEPROMs getting probed correctly. Within the probe
>> function i2c reads are performed and work. After that I can
>> interact with the devices on the bus, but not with the devices
>> behind the mux.
>> 
>> Sometimes I see:
>> 
>> $ hexdump /sys/bus/i2c/devices/3-0057/eeprom hexdump:
>> /sys/bus/i2c/devices/3-0057/eeprom: Connection timed out
>> 
>> Sometimes the whole system just hangs, I do not see any cause/log
>> messages.
>> 
>> The patch prevents all that. Any idea what could cause this?
> 
> What are the GPIO expanders used for? Is the PMIC perhaps involved
> with some device on the other side of the mux and that's what's
> triggering the deadlock?

I was wrong, only one GPIO expander is on the bus. It is used for some
GPIO buttons, no interaction with other devices here. The PMIC is not
involved either.

> Have you tried running with lockdep? I don't think it will tell you 
> all that much since the i2c locks are rt_mutexes and are not tracked 
> IIRC, but maybe?

Yes, I tried running with lockdep but did not see anything.

> *time passes*
> 
> Looking at the driver for at24c32 I see that the WP pin may be 
> controlled by a gpio driver. Is the eeprom WP-pins perhaps connected 
> to the gpio-expanders? Could that lead to a dead-lock somewhere?

No, the WP pin is hardwired. And I did not even try to write to the
EEPROM, only read operations.

> These are the kinds of cross-device interactions on the i2c bus that 
> you should be looking for...

After reviewing the schematics again I could not find anything
unfortunately.

>>>> Make use of this new locking concept: use the introduced flag
>>>> I2C_MUX_LOCKED along with lock-aware i2c_transfer() instead of
>>>> __i2c_transfer().
>>> 
>>> 
>>>> Signed-off-by: Bastian Stender <bst@pengutronix.de> --- 
>>>> drivers/i2c/muxes/i2c-mux-pca954x.c | 7 ++++--- 1 file changed,
>>>> 4 insertions(+), 3 deletions(-)
>>>> 
>>>> diff --git a/drivers/i2c/muxes/i2c-mux-pca954x.c
>>>> b/drivers/i2c/muxes/i2c-mux-pca954x.c index
>>>> 09bafd3e68fa..0ea970eaa324 100644 ---
>>>> a/drivers/i2c/muxes/i2c-mux-pca954x.c +++
>>>> b/drivers/i2c/muxes/i2c-mux-pca954x.c @@ -230,7 +230,7 @@
>>>> static int pca954x_reg_write(struct i2c_adapter *adap, msg.len
>>>> = 1; buf[0] = val; msg.buf = buf; -		ret = __i2c_transfer(adap,
>>>> &msg, 1); +		ret = i2c_transfer(adap, &msg, 1);
>>>> 
>>> 
>>> This is not complete, since you do not fix the unlocked SMBus
>>> access below, which you must do if you switch to mux-locked.
>> 
>> Oh, right. I am always getting into the if path, so the missing
>> else path did not strike me.
>> 
>>> However, *IF* this driver can be changed to be mux-locked (which
>>> it probably can't, there are implications...) the whole of
>>> pca954x_reg_write should be removed and call sites updated with
>>> regular i2c_smbus_write_byte calls (or whatever is appropriate, I
>>> didn't check all that carefully).
>>> 
>>> I would love it if this was possible, and it is for the simple
>>> cases which is somewhat annoying. But if you consider some of the
>>> more complex examples in Documentation/i2c/i2c-topology you will
>>> see that it probably can't be done without causing problems
>>> elsewhere.
>> 
>> Okay, I see. I will do my best to make it work without mux-locked
>> then.
> 
> If you can't, mux-locked could perhaps be made optional with a DT
> property or something. See the binding for i2c-mux-gpmux for
> precedent.

Sounds good. I will change the patch to handle opt-in mux-locked then.

Regards,
Bastian

Patch

diff --git a/drivers/i2c/muxes/i2c-mux-pca954x.c b/drivers/i2c/muxes/i2c-mux-pca954x.c
index 09bafd3e68fa..0ea970eaa324 100644
--- a/drivers/i2c/muxes/i2c-mux-pca954x.c
+++ b/drivers/i2c/muxes/i2c-mux-pca954x.c
@@ -230,7 +230,7 @@  static int pca954x_reg_write(struct i2c_adapter *adap,
 		msg.len = 1;
 		buf[0] = val;
 		msg.buf = buf;
-		ret = __i2c_transfer(adap, &msg, 1);
+		ret = i2c_transfer(adap, &msg, 1);
 
 		if (ret >= 0 && ret != 1)
 			ret = -EREMOTEIO;
@@ -380,8 +380,9 @@  static int pca954x_probe(struct i2c_client *client,
 		return -ENODEV;
 
 	muxc = i2c_mux_alloc(adap, &client->dev,
-			     PCA954X_MAX_NCHANS, sizeof(*data), 0,
-			     pca954x_select_chan, pca954x_deselect_mux);
+			     PCA954X_MAX_NCHANS, sizeof(*data),
+			     I2C_MUX_LOCKED, pca954x_select_chan,
+			     pca954x_deselect_mux);
 	if (!muxc)
 		return -ENOMEM;
 	data = i2c_mux_priv(muxc);