diff mbox

eeprom: at24: Add support for large EEPROMs connected to SMBus adapters

Message ID 1423067017-27607-1-git-send-email-linux@roeck-us.net
State Deferred
Headers show

Commit Message

Guenter Roeck Feb. 4, 2015, 4:23 p.m. UTC
Large EEPROMS (24c32 and larger) require a two-byte data address
instead of just a single byte. Implement support for such EEPROMs
with SMBus commands.

Support has limitations (reads are not multi-master safe) and is slow,
but it works. Practical use is for a system with 24c32 connected to
Intel 82801I (ICH9).

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/misc/eeprom/at24.c | 52 +++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 47 insertions(+), 5 deletions(-)

Comments

Wolfram Sang Feb. 4, 2015, 5:47 p.m. UTC | #1
On Wed, Feb 04, 2015 at 08:23:37AM -0800, Guenter Roeck wrote:
> Large EEPROMS (24c32 and larger) require a two-byte data address
> instead of just a single byte. Implement support for such EEPROMs
> with SMBus commands.
> 
> Support has limitations (reads are not multi-master safe) and is slow,
> but it works. Practical use is for a system with 24c32 connected to
> Intel 82801I (ICH9).

Can't you simply use i2c-dev to access the EEPROM? In multi-master
environments, things can really go wrong, so I wouldn't like to add
something dangerous by default. Maybe with a module parameter named
"allow-multimaster-unsafe-access-to-large-eeproms-with-smbus" which is
default off. But I'd really prefer the i2c-dev solution. Hooking a 16bit
EEPROM to SMBus is daring, after all. SMBus is multi-master, too.
Guenter Roeck Feb. 4, 2015, 7:08 p.m. UTC | #2
On Wed, Feb 04, 2015 at 06:47:23PM +0100, Wolfram Sang wrote:
> 
> On Wed, Feb 04, 2015 at 08:23:37AM -0800, Guenter Roeck wrote:
> > Large EEPROMS (24c32 and larger) require a two-byte data address
> > instead of just a single byte. Implement support for such EEPROMs
> > with SMBus commands.
> > 
> > Support has limitations (reads are not multi-master safe) and is slow,
> > but it works. Practical use is for a system with 24c32 connected to
> > Intel 82801I (ICH9).
> 
> Can't you simply use i2c-dev to access the EEPROM? In multi-master
> environments, things can really go wrong, so I wouldn't like to add
> something dangerous by default. Maybe with a module parameter named
> "allow-multimaster-unsafe-access-to-large-eeproms-with-smbus" which is
> default off. But I'd really prefer the i2c-dev solution. Hooking a 16bit
> EEPROM to SMBus is daring, after all. SMBus is multi-master, too.
> 
Hi Wolfram,

At the same time multi-master access is quite rare. Also, many of the
kernel's i2c drivers are not multi-master-access clean. In many cases
that isn't even possible due to the chip architecture (a good example
are chips with multiple 'pages', where the page address is set with
one i2c command and the actual access occurs with subsequent commands).
Mandating that the at24 driver shall be multi-master clean doesn't help
all those other drivers or chips and only creates a false sense of security.
The only (somewhat) clean means to support multi-master i2c access is
to use bus master selector chips such as the pca9541 in a design.

No, we are not going to use i2c-dev, as it would require breaking our
user space / kernel ABI just because this one eeprom is not supported
by the kernel driver. It would also be even more risky as the i2c-dev access
would have to be limited to a single application (ie it would not even be
single-master clean). I don't like the idea, and I don't understand the
rationale, but if the patch is not acceptable upstream we'll just carry
it locally.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Feb. 4, 2015, 8:33 p.m. UTC | #3
On Wed, Feb 04, 2015 at 06:47:23PM +0100, Wolfram Sang wrote:
> 
> On Wed, Feb 04, 2015 at 08:23:37AM -0800, Guenter Roeck wrote:
> > Large EEPROMS (24c32 and larger) require a two-byte data address
> > instead of just a single byte. Implement support for such EEPROMs
> > with SMBus commands.
> > 
> > Support has limitations (reads are not multi-master safe) and is slow,
> > but it works. Practical use is for a system with 24c32 connected to
> > Intel 82801I (ICH9).
> 
> Can't you simply use i2c-dev to access the EEPROM? In multi-master
> environments, things can really go wrong, so I wouldn't like to add
> something dangerous by default. Maybe with a module parameter named
> "allow-multimaster-unsafe-access-to-large-eeproms-with-smbus" which is
> default off. But I'd really prefer the i2c-dev solution. Hooking a 16bit
> EEPROM to SMBus is daring, after all. SMBus is multi-master, too.
> 

Hi Wolfram,

thinking about it some more, the problem you are concerned about really is
that 24c32 and larger EEPROMs can not be reliably supported in multi-master
configurations with SMBus adapters.

My solution to that problem would be not to build such hardware. Your
solution appears to be not to support 24c32 and larger EEPROMs on SMBUs
adapters at all, which I consider highly restrictive.

Please reconsider.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfram Sang Feb. 4, 2015, 11:35 p.m. UTC | #4
On Wed, Feb 04, 2015 at 11:08:19AM -0800, Guenter Roeck wrote:
> On Wed, Feb 04, 2015 at 06:47:23PM +0100, Wolfram Sang wrote:
> > 
> > On Wed, Feb 04, 2015 at 08:23:37AM -0800, Guenter Roeck wrote:
> > > Large EEPROMS (24c32 and larger) require a two-byte data address
> > > instead of just a single byte. Implement support for such EEPROMs
> > > with SMBus commands.
> > > 
> > > Support has limitations (reads are not multi-master safe) and is slow,
> > > but it works. Practical use is for a system with 24c32 connected to
> > > Intel 82801I (ICH9).
> > 
> > Can't you simply use i2c-dev to access the EEPROM? In multi-master
> > environments, things can really go wrong, so I wouldn't like to add
> > something dangerous by default. Maybe with a module parameter named
> > "allow-multimaster-unsafe-access-to-large-eeproms-with-smbus" which is
> > default off. But I'd really prefer the i2c-dev solution. Hooking a 16bit
> > EEPROM to SMBus is daring, after all. SMBus is multi-master, too.
> > 
> Hi Wolfram,
> 
> At the same time multi-master access is quite rare.

It should still work in those rare cases. A setup of an SoC, an EC
(those like to be masters, as well), and an EEPROM is not very far off.

> Also, many of the kernel's i2c drivers are not multi-master-access
> clean. In many cases that isn't even possible due to the chip
> architecture (a good example are chips with multiple 'pages', where
> the page address is set with one i2c command and the actual access
> occurs with subsequent commands).

Huh, using i2c_transfer to ensure 'repeated start' between messages
should do the trick, no? Drivers doing this via SMBus calls are broken
and should be fixed.

And what do you think of my suggestion enabling your mode with a module
parameter?
Guenter Roeck Feb. 5, 2015, 12:26 a.m. UTC | #5
On Thu, Feb 05, 2015 at 12:35:16AM +0100, Wolfram Sang wrote:
> On Wed, Feb 04, 2015 at 11:08:19AM -0800, Guenter Roeck wrote:
> > On Wed, Feb 04, 2015 at 06:47:23PM +0100, Wolfram Sang wrote:
> > > 
> > > On Wed, Feb 04, 2015 at 08:23:37AM -0800, Guenter Roeck wrote:
> > > > Large EEPROMS (24c32 and larger) require a two-byte data address
> > > > instead of just a single byte. Implement support for such EEPROMs
> > > > with SMBus commands.
> > > > 
> > > > Support has limitations (reads are not multi-master safe) and is slow,
> > > > but it works. Practical use is for a system with 24c32 connected to
> > > > Intel 82801I (ICH9).
> > > 
> > > Can't you simply use i2c-dev to access the EEPROM? In multi-master
> > > environments, things can really go wrong, so I wouldn't like to add
> > > something dangerous by default. Maybe with a module parameter named
> > > "allow-multimaster-unsafe-access-to-large-eeproms-with-smbus" which is
> > > default off. But I'd really prefer the i2c-dev solution. Hooking a 16bit
> > > EEPROM to SMBus is daring, after all. SMBus is multi-master, too.
> > > 
> > Hi Wolfram,
> > 
> > At the same time multi-master access is quite rare.
> 
> It should still work in those rare cases. A setup of an SoC, an EC
> (those like to be masters, as well), and an EEPROM is not very far off.
> 
It won't work if the 24c32 is connected to a controller which only supports
SMBus transactions. Also, there is no problem if it is connected to a
controller supporting direct i2c accesses.

The only real solution is to not use an SMBus-only controller together
with 24c32 or any similar chip in such situations.

> > Also, many of the kernel's i2c drivers are not multi-master-access
> > clean. In many cases that isn't even possible due to the chip
> > architecture (a good example are chips with multiple 'pages', where
> > the page address is set with one i2c command and the actual access
> > occurs with subsequent commands).
> 
> Huh, using i2c_transfer to ensure 'repeated start' between messages
> should do the trick, no? Drivers doing this via SMBus calls are broken
> and should be fixed.
> 

That would require all those drivers to have two separate implementations,
or to declare that any such chips must only be connected to controllers
supporting direct i2c accesses. Both isn't really practical and defies
the real world, where any kind of chip will be connected to SMBus-only
controllers. A worse situation would occur if the chip specifically
(only) supports SMBus transactions and thus doesn't explicitly support
repeated start after completion of one transaction. This may or may not
work depending on the chip or even chip revision, and it usually would
not be specified as a valid transaction in the chip datasheet.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfram Sang Feb. 5, 2015, 2:40 p.m. UTC | #6
> > > At the same time multi-master access is quite rare.
> > 
> > It should still work in those rare cases. A setup of an SoC, an EC
> > (those like to be masters, as well), and an EEPROM is not very far off.

To clarify: What I meant here is, at24 should also work in those rare
cases reliably (like the described case of SoC, EC and EEPROM). Your
patch sacrificies this reliability.

> It won't work if the 24c32 is connected to a controller which only supports
> SMBus transactions. Also, there is no problem if it is connected to a
> controller supporting direct i2c accesses.

I know and agree.

> The only real solution is to not use an SMBus-only controller together
> with 24c32 or any similar chip in such situations.

Exactly.

> > Huh, using i2c_transfer to ensure 'repeated start' between messages
> > should do the trick, no? Drivers doing this via SMBus calls are broken
> > and should be fixed.
> > 
> 
> That would require all those drivers to have two separate implementations,
> or to declare that any such chips must only be connected to controllers
> supporting direct i2c accesses.

Yes, TBH I though this was the case. Which drivers did you find using
unsafe SMBus transactions?

> Both isn't really practical and defies the real world, where any kind
> of chip will be connected to SMBus-only controllers.

Is it that bad? The vast majority of controllers I come across in the
embedded world are more like I2C than SMBus controllers. They have their
quirks but most support repeated starts somehow. This is why I came up
with the quirk infrastructure BTW, to describe what the controller is
capable of and what not. So, users can check if the controller is up to
what the user needs and then they can decide what to do if they
encounter a too limited controller.

Back to topic: I2C and SMBus are multi-master busses. Drivers have to
cope with that. If we say "let's assume single master busses, because
most of them out there are exactly that", then we open a window for
very subtle and hard to find bugs for the multi-master case. And people
would get rightfully angry IMO, because they should expect specified
things to work.

However, if people are connecting I2C devices to an SMBus controller,
well, they ask for prob^H^H^H^H special cases and they need to deal with
that. And using a module parameter is not much to deal with if you ask
me. BTW another idea I got was that a user could mark an SMBus
controller as "single-master", so a driver can check for that and use
some fallback.

I am really all for the "better-safe-than-sorry" approach here. I am not
objecting if somebody wants to move out of that area. It needs some form
of "yes, I really want to do that" however.


> A worse situation would occur if the chip specifically
> (only) supports SMBus transactions and thus doesn't explicitly support
> repeated start after completion of one transaction.

You lost me here. Is "the chip" the master or the slave?

> This may or may not
> work depending on the chip or even chip revision, and it usually would
> not be specified as a valid transaction in the chip datasheet.

Can you elaborate, please?

Thanks,

   Wolfram
Guenter Roeck Feb. 5, 2015, 5:53 p.m. UTC | #7
On Thu, Feb 05, 2015 at 03:40:28PM +0100, Wolfram Sang wrote:
> > > > At the same time multi-master access is quite rare.
> > > 
> > > It should still work in those rare cases. A setup of an SoC, an EC
> > > (those like to be masters, as well), and an EEPROM is not very far off.
> 
> To clarify: What I meant here is, at24 should also work in those rare
> cases reliably (like the described case of SoC, EC and EEPROM). Your
> patch sacrificies this reliability.
> 
But it won't, and it can't. 24c32 should simply not be used in such cases.
Besides, at least on PCs, ACPI should reserve and thus block SMBus access
if it is used by an EC (or IPMI or similar).

I don't think it should (or even can) be the Linux kernel's responsibility
to ensure that the hardware does not prevent such problems.

> > It won't work if the 24c32 is connected to a controller which only supports
> > SMBus transactions. Also, there is no problem if it is connected to a
> > controller supporting direct i2c accesses.
> 
> I know and agree.
> 
> > The only real solution is to not use an SMBus-only controller together
> > with 24c32 or any similar chip in such situations.
> 
> Exactly.
> 
> > > Huh, using i2c_transfer to ensure 'repeated start' between messages
> > > should do the trick, no? Drivers doing this via SMBus calls are broken
> > > and should be fixed.
> > > 
> > 
> > That would require all those drivers to have two separate implementations,
> > or to declare that any such chips must only be connected to controllers
> > supporting direct i2c accesses.
> 
> Yes, TBH I though this was the case. Which drivers did you find using
> unsafe SMBus transactions?
> 

An easy example would be multi-bank i2c hardware monitoring chips.
Those will typically be connected to the ICH and thus depend on SMBus
transactions (besides for the most part being specified as SMBus chips),
and obviously require multiple transactions in a sequence (though
normally the bank register is simply cached and not rewritten).

But there are many other examples.

As a starting point, all i2c muxes and by implication drivers for chips
behind them are not multi-master safe. There is nothing preventing
master #2 from changing the mux port after master #1 selected it.

That is just a start. It is hard to imagine that any I2C based TPM chip
is multi-master safe, much less its driver. At least some I2C GPIO expanders
require multiple SMBus transactions (look for pca957X). Others cache output
bytes and only update bits when updating pin values (pcf857x). It is also
difficult to imagine that any i2c clock chips can be programmed or
even polled with a single SMBus transaction.

Another interesting example is regmap-i2c, specifically regmap_update_bits.
This function performs an i2c read followed by some bit masking and a write.
Obviously that is not multi-master safe, no matter what you do.
Any other i2c chip driver which has to do a read-modify-write sequence
is by definition not multi-master safe (including most if not all i2c
gpio expanders).

Browsing through code, I also see accelerometer drivers using multi-command
SMBus transactions. At least some adc drivers do the same. Humidity sensors do
it, both in hwmon and iio. Various light sensors do it. Magnetometer sensors
do it. Keyboard drivers do it. And so on; at this point I stopped looking.

> > Both isn't really practical and defies the real world, where any kind
> > of chip will be connected to SMBus-only controllers.
> 
> Is it that bad? The vast majority of controllers I come across in the
> embedded world are more like I2C than SMBus controllers. They have their
> quirks but most support repeated starts somehow. This is why I came up

I encounter a lot of chips connected to ICH. Obviously all of them are
affected.

> with the quirk infrastructure BTW, to describe what the controller is
> capable of and what not. So, users can check if the controller is up to
> what the user needs and then they can decide what to do if they
> encounter a too limited controller.
> 
> Back to topic: I2C and SMBus are multi-master busses. Drivers have to
> cope with that. If we say "let's assume single master busses, because
> most of them out there are exactly that", then we open a window for
> very subtle and hard to find bugs for the multi-master case. And people
> would get rightfully angry IMO, because they should expect specified
> things to work.
> 
I don't think that is correct; there would have to be a lot of angry
people out there if that would be the case.

> However, if people are connecting I2C devices to an SMBus controller,
> well, they ask for prob^H^H^H^H special cases and they need to deal with
> that. And using a module parameter is not much to deal with if you ask
> me. BTW another idea I got was that a user could mark an SMBus
> controller as "single-master", so a driver can check for that and use
> some fallback.
> 
You would have to add that module parameter to a lot of drivers. And you would
have to rewrite a lot of drivers to use i2c transactions instead of SMBus
transactions. As mentioned above, in read-modify-write cases that isn't even
possible, or at least I would have no idea how to implement it.

> I am really all for the "better-safe-than-sorry" approach here. I am not
> objecting if somebody wants to move out of that area. It needs some form
> of "yes, I really want to do that" however.
> 
How about adding a warning to the driver, such as the following,
if chips such as the 24c32 are connected to an SMBus controller ?

	"reading data from the chip in multi-master systems is not reliable"

> 
> > A worse situation would occur if the chip specifically
> > (only) supports SMBus transactions and thus doesn't explicitly support
> > repeated start after completion of one transaction.
> 
> You lost me here. Is "the chip" the master or the slave?
> 
Slave.

> > This may or may not
> > work depending on the chip or even chip revision, and it usually would
> > not be specified as a valid transaction in the chip datasheet.
> 
> Can you elaborate, please?
> 
As an example, a normal write byte sequence is
	<s><addr><w><A><command><A><data><A><p>
and read sequence is
	<s><addr><w><A><command><A><sr><addr><r><A><data><N><p>

In your case you would want to combine that to something like
	<s><addr><w><A><command><A><data><A>
	<sr><addr><w><A><command><A><sr><addr><r><A><data><N><p>

I am not aware of a SMBus (client) chip which explicitly specifies
that it would support such combined SMBUs transactions in a single
sequence without stop condition in between. Maybe there are some,
but I never encountered any. I also don't see such a sequence
specified in the SMBus standard, though maybe I am missing it.
Given that, I would be very hesitant to implement and rely on
support for such a sequence in any of my drivers.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck Feb. 12, 2015, 4:01 a.m. UTC | #8
On Thu, Feb 05, 2015 at 09:53:26AM -0800, Guenter Roeck wrote:
> On Thu, Feb 05, 2015 at 03:40:28PM +0100, Wolfram Sang wrote:
> > > > > At the same time multi-master access is quite rare.
> > > > 
> > > > It should still work in those rare cases. A setup of an SoC, an EC
> > > > (those like to be masters, as well), and an EEPROM is not very far off.
> > 
> > To clarify: What I meant here is, at24 should also work in those rare
> > cases reliably (like the described case of SoC, EC and EEPROM). Your
> > patch sacrificies this reliability.
> > 
> But it won't, and it can't. 24c32 should simply not be used in such cases.
> Besides, at least on PCs, ACPI should reserve and thus block SMBus access
> if it is used by an EC (or IPMI or similar).
> 
> I don't think it should (or even can) be the Linux kernel's responsibility
> to ensure that the hardware does not prevent such problems.
> 
Hi Wolfram,

I wonder where we are with thisp patch; I don't recall a reply to my previous
e-mail.

Do you need some more time to think about it ? Otherwise I'll publish an
out-of-tree version of the at24 driver with the patch applied on github,
for those who might need the functionality provided by this patch.

Please let me know.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfram Sang Feb. 16, 2015, 12:09 p.m. UTC | #9
Hi Guenter,

> I wonder where we are with thisp patch; I don't recall a reply to my previous
> e-mail.

Sorry for the late reply. I needed to recover from a HDD headcrash :(

> Do you need some more time to think about it ? Otherwise I'll publish an
> out-of-tree version of the at24 driver with the patch applied on github,
> for those who might need the functionality provided by this patch.

Your last mail made me aware of why we were missing each other before. I
see your point now, but yes, still need to think about it. My plan is to
have a decision until the 3.21 merge window.

All the best,

   Wolfram
Guenter Roeck Feb. 16, 2015, 3:37 p.m. UTC | #10
On 02/16/2015 04:09 AM, Wolfram Sang wrote:
> Hi Guenter,
>
>> I wonder where we are with thisp patch; I don't recall a reply to my previous
>> e-mail.
>
> Sorry for the late reply. I needed to recover from a HDD headcrash :(
>
>> Do you need some more time to think about it ? Otherwise I'll publish an
>> out-of-tree version of the at24 driver with the patch applied on github,
>> for those who might need the functionality provided by this patch.
>
> Your last mail made me aware of why we were missing each other before. I
> see your point now, but yes, still need to think about it. My plan is to
> have a decision until the 3.21 merge window.
>

Ok, fair enough.

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck March 17, 2015, 4:20 a.m. UTC | #11
On Mon, Feb 16, 2015 at 01:09:51PM +0100, Wolfram Sang wrote:
> Hi Guenter,
> 
> > I wonder where we are with thisp patch; I don't recall a reply to my previous
> > e-mail.
> 
> Sorry for the late reply. I needed to recover from a HDD headcrash :(
> 
> > Do you need some more time to think about it ? Otherwise I'll publish an
> > out-of-tree version of the at24 driver with the patch applied on github,
> > for those who might need the functionality provided by this patch.
> 
> Your last mail made me aware of why we were missing each other before. I
> see your point now, but yes, still need to think about it. My plan is to
> have a decision until the 3.21 merge window.
> 
Hi Wolfram,

any news ?

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfram Sang March 18, 2015, 1:27 p.m. UTC | #12
On Mon, Mar 16, 2015 at 09:20:49PM -0700, Guenter Roeck wrote:
> On Mon, Feb 16, 2015 at 01:09:51PM +0100, Wolfram Sang wrote:
> > Hi Guenter,
> > 
> > > I wonder where we are with thisp patch; I don't recall a reply to my previous
> > > e-mail.
> > 
> > Sorry for the late reply. I needed to recover from a HDD headcrash :(
> > 
> > > Do you need some more time to think about it ? Otherwise I'll publish an
> > > out-of-tree version of the at24 driver with the patch applied on github,
> > > for those who might need the functionality provided by this patch.
> > 
> > Your last mail made me aware of why we were missing each other before. I
> > see your point now, but yes, still need to think about it. My plan is to
> > have a decision until the 3.21 merge window.
> > 
> Hi Wolfram,
> 
> any news ?

Yes :)

The main misunderstanding we had before was: You were talking about
multi-master safety between transfers, while I was thinking about
multi-master safety between messages. While we need to guarantee this
for the latter, you are right about the former, sadly. True multi-master
safety between transfers is probably like a can of worms currently.

Still, I think we have a race with your patch when having two read
processes. If b) kicks in after a) has just set the eeprom pointer, a)
will not read the data it wants. For that to prevent, we should take the
adapter_lock during those two transfers needed for the read you
implemented. My preferred solution would be to have __smbus_transfer
like we have __i2c_transfer and then using that. Some mux code could
also make use out of that. But if you are going to use
adapter->algo->smbus_xfer() directly, well, then be it.

Thanks,

   Wolfram
Guenter Roeck March 19, 2015, 3:24 a.m. UTC | #13
On 03/18/2015 06:27 AM, Wolfram Sang wrote:
> On Mon, Mar 16, 2015 at 09:20:49PM -0700, Guenter Roeck wrote:
>> On Mon, Feb 16, 2015 at 01:09:51PM +0100, Wolfram Sang wrote:
>>> Hi Guenter,
>>>
>>>> I wonder where we are with thisp patch; I don't recall a reply to my previous
>>>> e-mail.
>>>
>>> Sorry for the late reply. I needed to recover from a HDD headcrash :(
>>>
>>>> Do you need some more time to think about it ? Otherwise I'll publish an
>>>> out-of-tree version of the at24 driver with the patch applied on github,
>>>> for those who might need the functionality provided by this patch.
>>>
>>> Your last mail made me aware of why we were missing each other before. I
>>> see your point now, but yes, still need to think about it. My plan is to
>>> have a decision until the 3.21 merge window.
>>>
>> Hi Wolfram,
>>
>> any news ?
>
> Yes :)
>
> The main misunderstanding we had before was: You were talking about
> multi-master safety between transfers, while I was thinking about
> multi-master safety between messages. While we need to guarantee this
> for the latter, you are right about the former, sadly. True multi-master
> safety between transfers is probably like a can of worms currently.
>
> Still, I think we have a race with your patch when having two read
> processes. If b) kicks in after a) has just set the eeprom pointer, a)
> will not read the data it wants. For that to prevent, we should take the
> adapter_lock during those two transfers needed for the read you
> implemented. My preferred solution would be to have __smbus_transfer
> like we have __i2c_transfer and then using that. Some mux code could
> also make use out of that. But if you are going to use
> adapter->algo->smbus_xfer() directly, well, then be it.
>

You are right, that is a problem. Not for eeprom access itself - that already
has a mutex - but for parallel access to the chip through the eeprom file
and, say, by i2cdump.

I don't call that multi-master, though, so I guess we may have a bit of a
terminology problem.

I'll see what I can come up with, but I am not sure if I'll find the time
before the 4.1 commit window opens. Company has a working solution (kind of),
so now I'll have to do this on my own time ;-).

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfram Sang March 19, 2015, 8:16 a.m. UTC | #14
> I don't call that multi-master, though, so I guess we may have a bit of a
> terminology problem.

This is definately not a multi-master issue, I agree. It is just
another issue I saw when thinking about your patch thoroughly again.

> I'll see what I can come up with, but I am not sure if I'll find the time
> before the 4.1 commit window opens. Company has a working solution (kind of),
> so now I'll have to do this on my own time ;-).

But their solution is buggy and needs a fix! :)
Guenter Roeck March 19, 2015, 1:30 p.m. UTC | #15
On 03/19/2015 01:16 AM, Wolfram Sang wrote:
>
>> I don't call that multi-master, though, so I guess we may have a bit of a
>> terminology problem.
>
> This is definately not a multi-master issue, I agree. It is just
> another issue I saw when thinking about your patch thoroughly again.
>
>> I'll see what I can come up with, but I am not sure if I'll find the time
>> before the 4.1 commit window opens. Company has a working solution (kind of),
>> so now I'll have to do this on my own time ;-).
>
> But their solution is buggy and needs a fix! :)
>
I ended up looking into this last night, and actually have some untested code.

However, looking through the kernel, the problem it solves turns out to be
wide-spread. Almost every caller of i2c_smbus_read_byte() does the call as
part of a call sequence, and thus has that very same problem. That includes,
for example, the max1363 and the ds2482 drivers, both of which are used
in our system. So, while the solution may be buggy, that bug is wide-spread
and no one really seems to care about it (or did not realize it). This gives
"needs a fix" a completely different scope.

Given that, I may spend some time trying to see if I can reproduce the problem
before trying to fix it.

Thanks,
Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Guenter Roeck March 19, 2015, 5:43 p.m. UTC | #16
On Thu, Mar 19, 2015 at 09:16:21AM +0100, Wolfram Sang wrote:
> 
> > I don't call that multi-master, though, so I guess we may have a bit of a
> > terminology problem.
> 
> This is definately not a multi-master issue, I agree. It is just
> another issue I saw when thinking about your patch thoroughly again.
> 
> > I'll see what I can come up with, but I am not sure if I'll find the time
> > before the 4.1 commit window opens. Company has a working solution (kind of),
> > so now I'll have to do this on my own time ;-).
> 
> But their solution is buggy and needs a fix! :)
> 
Turns out this is really easy to reproduce. One process reads
the eeprom over and over again, another runs i2cdump in a loop,
and voila ... lots of corruptions. Scary, especially considering
how wide-spread this kind of i2c access is in the kernel.

Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfram Sang March 19, 2015, 9:39 p.m. UTC | #17
> Turns out this is really easy to reproduce. One process reads
> the eeprom over and over again, another runs i2cdump in a loop,
> and voila ... lots of corruptions. Scary, especially considering
> how wide-spread this kind of i2c access is in the kernel.

A coccinelle script should at least be able to find vulnerable code
paths, maybe even fix it. But not today for me... Thanks for testing and
sharing the results!
Guenter Roeck March 25, 2015, 2:11 p.m. UTC | #18
On 03/19/2015 02:39 PM, Wolfram Sang wrote:
>
>> Turns out this is really easy to reproduce. One process reads
>> the eeprom over and over again, another runs i2cdump in a loop,
>> and voila ... lots of corruptions. Scary, especially considering
>> how wide-spread this kind of i2c access is in the kernel.
>
> A coccinelle script should at least be able to find vulnerable code
> paths, maybe even fix it. But not today for me... Thanks for testing and
> sharing the results!
>

Wolfram,

just to give you an update: I do have some code, but it is a bit messy,
and it doesn't work well for ds2482 (the chip behind it still hangs up
if I access it in parallel through i2c-dev). On top of that, it causes
pretty significant slow-downs when accessing other devices on the same
bus at the same time. Not surprising, I guess, since it expands the scope
of the bus lock significantly.

I thought about introducing a client lock, but that does not work because
of the way i2c-dev is written (creating its own 'shadow' client structure).
An address lock (ie a client lock based on <bus, address> instead of one
residing in the client structure) seems to be too expensive.

So right now I don't really know how to proceed, or if to proceed at all.
I'll think about it some more, but given how wide-spread the problem is
in the kernel, I might just leave it alone, and keep the at24 changes
out of tree.

Ultimately, the real problem is how i2c-dev accesses a client, not how
i2c client drivers (who assume they have exclusive access to a chip)
handle multi-command sequences. Forcing extensive locking on all drivers
because of i2c-dev just doesn't seem to be the right thing to do.

Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfram Sang March 25, 2015, 4:15 p.m. UTC | #19
Guenter,

thanks for the update

> Ultimately, the real problem is how i2c-dev accesses a client, not how
> i2c client drivers (who assume they have exclusive access to a chip)
> handle multi-command sequences. Forcing extensive locking on all drivers
> because of i2c-dev just doesn't seem to be the right thing to do.

I agree. i2c-dev is too much of a special case.

And since at24 has its own lock (I missed that), your patch might as
well be good enough to be applied, I'd think.

Thanks for this work!
Guenter Roeck March 25, 2015, 4:37 p.m. UTC | #20
On Wed, Mar 25, 2015 at 05:15:04PM +0100, Wolfram Sang wrote:
> Guenter,
> 
> thanks for the update
> 
> > Ultimately, the real problem is how i2c-dev accesses a client, not how
> > i2c client drivers (who assume they have exclusive access to a chip)
> > handle multi-command sequences. Forcing extensive locking on all drivers
> > because of i2c-dev just doesn't seem to be the right thing to do.
> 
> I agree. i2c-dev is too much of a special case.
> 
> And since at24 has its own lock (I missed that), your patch might as
> well be good enough to be applied, I'd think.
> 

Ah, sorry, I blindly assumed that you are aware of that. Yes, at24
itself is not the problem, it is parallel access to the chip by i2c-dev.

The same is actually true for all the other drivers I looked at;
usually they have their own lock(s), but such locks do not protect
against interference by i2c-dev.

The bad part is that i2c-dev is heavily used by user space at my
workplace, and that code happily messes with chips which are also
handled by kernel drivers. But as I said, I have no real good idea
how to fix that - neither the user-space code nor how i2c-dev
interfers with (or completely messes up) device access by drivers.

Thanks,
Guenter
--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfram Sang March 27, 2015, 8:09 a.m. UTC | #21
> just to give you an update: I do have some code, but it is a bit messy,
> and it doesn't work well for ds2482 (the chip behind it still hangs up
> if I access it in parallel through i2c-dev). On top of that, it causes
> pretty significant slow-downs when accessing other devices on the same
> bus at the same time. Not surprising, I guess, since it expands the scope
> of the bus lock significantly.

Just to get a better idea: Did you try taking the adapter_lock before
the two SMBus command which needed to be concatenated (and use
smbus_xfer directly)?
Guenter Roeck March 27, 2015, 12:51 p.m. UTC | #22
On 03/27/2015 01:09 AM, Wolfram Sang wrote:
>
>> just to give you an update: I do have some code, but it is a bit messy,
>> and it doesn't work well for ds2482 (the chip behind it still hangs up
>> if I access it in parallel through i2c-dev). On top of that, it causes
>> pretty significant slow-downs when accessing other devices on the same
>> bus at the same time. Not surprising, I guess, since it expands the scope
>> of the bus lock significantly.
>
> Just to get a better idea: Did you try taking the adapter_lock before
> the two SMBus command which needed to be concatenated (and use
> smbus_xfer directly)?
>
I did. I didn't use smbus_xfer directly, though, but introduced lockless
versions of the various smbus commands, and kept using those.

Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfram Sang March 27, 2015, 1:01 p.m. UTC | #23
On Fri, Mar 27, 2015 at 05:51:11AM -0700, Guenter Roeck wrote:
> On 03/27/2015 01:09 AM, Wolfram Sang wrote:
> >
> >>just to give you an update: I do have some code, but it is a bit messy,
> >>and it doesn't work well for ds2482 (the chip behind it still hangs up
> >>if I access it in parallel through i2c-dev). On top of that, it causes
> >>pretty significant slow-downs when accessing other devices on the same
> >>bus at the same time. Not surprising, I guess, since it expands the scope
> >>of the bus lock significantly.
> >
> >Just to get a better idea: Did you try taking the adapter_lock before
> >the two SMBus command which needed to be concatenated (and use
> >smbus_xfer directly)?
> >
> I did. I didn't use smbus_xfer directly, though, but introduced lockless
> versions of the various smbus commands, and kept using those.

And then the chip still hangs? Or was that the performance penalty here?
Guenter Roeck March 27, 2015, 1:14 p.m. UTC | #24
On 03/27/2015 06:01 AM, Wolfram Sang wrote:
> On Fri, Mar 27, 2015 at 05:51:11AM -0700, Guenter Roeck wrote:
>> On 03/27/2015 01:09 AM, Wolfram Sang wrote:
>>>
>>>> just to give you an update: I do have some code, but it is a bit messy,
>>>> and it doesn't work well for ds2482 (the chip behind it still hangs up
>>>> if I access it in parallel through i2c-dev). On top of that, it causes
>>>> pretty significant slow-downs when accessing other devices on the same
>>>> bus at the same time. Not surprising, I guess, since it expands the scope
>>>> of the bus lock significantly.
>>>
>>> Just to get a better idea: Did you try taking the adapter_lock before
>>> the two SMBus command which needed to be concatenated (and use
>>> smbus_xfer directly)?
>>>
>> I did. I didn't use smbus_xfer directly, though, but introduced lockless
>> versions of the various smbus commands, and kept using those.
>
> And then the chip still hangs? Or was that the performance penalty here?
>
Parallel access to a second eeprom chip on the same bus was much slower
than before.

Also, the new code did not solve the problem for ds2482 (completely unrelated
to the at24 driver of course). Even with proper locking, the chip ended up
hanging after some parallel accesses through i2c-dev. Granted, ds2482 is
a difficult beast, but it is still annoying how access through i2c-dev
can mess it up.

The latter is what bothered me more: What is the point of all this if we
still can not ensure correct operation ?

Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wolfram Sang March 27, 2015, 3:27 p.m. UTC | #25
On Fri, Mar 27, 2015 at 06:14:28AM -0700, Guenter Roeck wrote:
> On 03/27/2015 06:01 AM, Wolfram Sang wrote:
> >On Fri, Mar 27, 2015 at 05:51:11AM -0700, Guenter Roeck wrote:
> >>On 03/27/2015 01:09 AM, Wolfram Sang wrote:
> >>>
> >>>>just to give you an update: I do have some code, but it is a bit messy,
> >>>>and it doesn't work well for ds2482 (the chip behind it still hangs up
> >>>>if I access it in parallel through i2c-dev). On top of that, it causes
> >>>>pretty significant slow-downs when accessing other devices on the same
> >>>>bus at the same time. Not surprising, I guess, since it expands the scope
> >>>>of the bus lock significantly.
> >>>
> >>>Just to get a better idea: Did you try taking the adapter_lock before
> >>>the two SMBus command which needed to be concatenated (and use
> >>>smbus_xfer directly)?
> >>>
> >>I did. I didn't use smbus_xfer directly, though, but introduced lockless
> >>versions of the various smbus commands, and kept using those.
> >
> >And then the chip still hangs? Or was that the performance penalty here?
> >
> Parallel access to a second eeprom chip on the same bus was much slower
> than before.

Interesting. I wonder what is the reason, I would have expected just a
small delay. Would you mind sending the patches for the non-locked smbus
routines? Would be nice to have that around in case I or someone else
find some time to try as well.

> Also, the new code did not solve the problem for ds2482 (completely unrelated
> to the at24 driver of course). Even with proper locking, the chip ended up
> hanging after some parallel accesses through i2c-dev. Granted, ds2482 is
> a difficult beast, but it is still annoying how access through i2c-dev
> can mess it up.

I assume you basically replaced the access_lock with the adapter_lock
with this one?

> 
> The latter is what bothered me more: What is the point of all this if we
> still can not ensure correct operation ?

Yeah, this is not good at all.

How do you use i2c-dev BTW? i2c_rdwr_msgs? What about iterating over all
msgs in that and check for busy addresses?
Guenter Roeck March 27, 2015, 3:42 p.m. UTC | #26
On 03/27/2015 08:27 AM, Wolfram Sang wrote:
> On Fri, Mar 27, 2015 at 06:14:28AM -0700, Guenter Roeck wrote:
>> On 03/27/2015 06:01 AM, Wolfram Sang wrote:
>>> On Fri, Mar 27, 2015 at 05:51:11AM -0700, Guenter Roeck wrote:
>>>> On 03/27/2015 01:09 AM, Wolfram Sang wrote:
>>>>>
>>>>>> just to give you an update: I do have some code, but it is a bit messy,
>>>>>> and it doesn't work well for ds2482 (the chip behind it still hangs up
>>>>>> if I access it in parallel through i2c-dev). On top of that, it causes
>>>>>> pretty significant slow-downs when accessing other devices on the same
>>>>>> bus at the same time. Not surprising, I guess, since it expands the scope
>>>>>> of the bus lock significantly.
>>>>>
>>>>> Just to get a better idea: Did you try taking the adapter_lock before
>>>>> the two SMBus command which needed to be concatenated (and use
>>>>> smbus_xfer directly)?
>>>>>
>>>> I did. I didn't use smbus_xfer directly, though, but introduced lockless
>>>> versions of the various smbus commands, and kept using those.
>>>
>>> And then the chip still hangs? Or was that the performance penalty here?
>>>
>> Parallel access to a second eeprom chip on the same bus was much slower
>> than before.
>
> Interesting. I wonder what is the reason, I would have expected just a
> small delay. Would you mind sending the patches for the non-locked smbus
> routines? Would be nice to have that around in case I or someone else
> find some time to try as well.
>
I pushed it into my linux repository at github (https://github.com/groeck/linux,
branch at24).

>> Also, the new code did not solve the problem for ds2482 (completely unrelated
>> to the at24 driver of course). Even with proper locking, the chip ended up
>> hanging after some parallel accesses through i2c-dev. Granted, ds2482 is
>> a difficult beast, but it is still annoying how access through i2c-dev
>> can mess it up.
>
> I assume you basically replaced the access_lock with the adapter_lock
> with this one?
>
yes.

>>
>> The latter is what bothered me more: What is the point of all this if we
>> still can not ensure correct operation ?
>
> Yeah, this is not good at all.
>
> How do you use i2c-dev BTW? i2c_rdwr_msgs? What about iterating over all
> msgs in that and check for busy addresses?
>
In this case, I just used i2cdump from one session while accessing
the chip from another session using the driver.

Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/misc/eeprom/at24.c b/drivers/misc/eeprom/at24.c
index 2d3db81..057d35c 100644
--- a/drivers/misc/eeprom/at24.c
+++ b/drivers/misc/eeprom/at24.c
@@ -198,6 +198,8 @@  static ssize_t at24_eeprom_read(struct at24_data *at24, char *buf,
 	case I2C_SMBUS_BYTE_DATA:
 		count = 1;
 		break;
+	case I2C_SMBUS_BYTE:
+		break;
 	default:
 		/*
 		 * When we have a better choice than SMBus calls, use a
@@ -249,6 +251,27 @@  static ssize_t at24_eeprom_read(struct at24_data *at24, char *buf,
 				status = count;
 			}
 			break;
+		case I2C_SMBUS_BYTE:
+			/*
+			 * 16-bit data address. Write data address as separate
+			 * write operation, then read data without setting
+			 * the address again. This is not multi-master safe,
+			 * but the best we can do.
+			 */
+			status = i2c_smbus_write_byte_data(client,
+							   offset >> 8,
+							   offset & 0xff);
+			if (status < 0)
+				break;
+			for (i = 0; i < count; i++) {
+				status = i2c_smbus_read_byte(client);
+				if (status < 0)
+					break;
+				buf[i] = status;
+			}
+			if (status >= 0)
+				status = count;
+			break;
 		default:
 			status = i2c_transfer(client->adapter, msg, 2);
 			if (status == 2)
@@ -372,6 +395,16 @@  static ssize_t at24_eeprom_write(struct at24_data *at24, const char *buf,
 				status = i2c_smbus_write_i2c_block_data(client,
 						offset, count, buf);
 				break;
+			case I2C_SMBUS_WORD_DATA:
+				/*
+				 * 16-bit data address. Transmit data address
+				 * MSB as SMBus command, data address LSB as
+				 * first data byte.
+				 */
+				status = i2c_smbus_write_word_data(client,
+					offset >> 8,
+					(offset & 0xff) | (buf[0] << 8));
+				break;
 			case I2C_SMBUS_BYTE_DATA:
 				status = i2c_smbus_write_byte_data(client,
 						offset, buf[0]);
@@ -540,10 +573,13 @@  static int at24_probe(struct i2c_client *client, const struct i2c_device_id *id)
 
 	/* Use I2C operations unless we're stuck with SMBus extensions. */
 	if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C)) {
-		if (chip.flags & AT24_FLAG_ADDR16)
-			return -EPFNOSUPPORT;
-
-		if (i2c_check_functionality(client->adapter,
+		if (chip.flags & AT24_FLAG_ADDR16) {
+			if (!i2c_check_functionality(client->adapter,
+						I2C_FUNC_SMBUS_WRITE_BYTE_DATA |
+						I2C_FUNC_SMBUS_READ_BYTE))
+				return -EPFNOSUPPORT;
+			use_smbus = I2C_SMBUS_BYTE;
+		} else if (i2c_check_functionality(client->adapter,
 				I2C_FUNC_SMBUS_READ_I2C_BLOCK)) {
 			use_smbus = I2C_SMBUS_I2C_BLOCK_DATA;
 		} else if (i2c_check_functionality(client->adapter,
@@ -559,7 +595,13 @@  static int at24_probe(struct i2c_client *client, const struct i2c_device_id *id)
 
 	/* Use I2C operations unless we're stuck with SMBus extensions. */
 	if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C)) {
-		if (i2c_check_functionality(client->adapter,
+		if (chip.flags & AT24_FLAG_ADDR16) {
+			if (i2c_check_functionality(client->adapter,
+					I2C_FUNC_SMBUS_WRITE_WORD_DATA)) {
+				use_smbus_write = I2C_SMBUS_WORD_DATA;
+				chip.page_size = 1;
+			}
+		} else if (i2c_check_functionality(client->adapter,
 				I2C_FUNC_SMBUS_WRITE_I2C_BLOCK)) {
 			use_smbus_write = I2C_SMBUS_I2C_BLOCK_DATA;
 		} else if (i2c_check_functionality(client->adapter,