Revert "i2c: cadance: fix ctrl/addr reg write order"

Message ID 20181206173306.31556-1-kyle.roeschley@ni.com
State New
Headers show
Series
  • Revert "i2c: cadance: fix ctrl/addr reg write order"
Related show

Commit Message

Kyle Roeschley Dec. 6, 2018, 5:33 p.m.
This reverts commit 8064c616984eaa015f018dba595d78cd24a0cc8c.

According to the Zynq TRM §20.2.2 under heading Read Transfer, HOLD should
be set or cleared before writing the I2C Address register. This fixes
sporadic i2c bus lockups on National Instruments Zynq-based devices.

Signed-off-by: Kyle Roeschley <kyle.roeschley@ni.com>
---
Tested by flooding the bus with traffic on a National Instruments cRIO-9068.
Previously the bus would lock up after ~30s, now I can run overnight with no
problems.

 drivers/i2c/busses/i2c-cadence.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Michal Simek Dec. 7, 2018, 9:15 a.m. | #1
On 06. 12. 18 18:33, Kyle Roeschley wrote:
> This reverts commit 8064c616984eaa015f018dba595d78cd24a0cc8c.
> 
> According to the Zynq TRM §20.2.2 under heading Read Transfer, HOLD should
> be set or cleared before writing the I2C Address register. This fixes
> sporadic i2c bus lockups on National Instruments Zynq-based devices.
> 
> Signed-off-by: Kyle Roeschley <kyle.roeschley@ni.com>
> ---
> Tested by flooding the bus with traffic on a National Instruments cRIO-9068.
> Previously the bus would lock up after ~30s, now I can run overnight with no
> problems.

Can you please share that app you use to be able to replicate it on our
side?

Thanks,
Michal
Kyle Roeschley Dec. 7, 2018, 10:22 p.m. | #2
On Fri, Dec 07, 2018 at 10:15:11AM +0100, Michal Simek wrote:
> On 06. 12. 18 18:33, Kyle Roeschley wrote:
> > This reverts commit 8064c616984eaa015f018dba595d78cd24a0cc8c.
> > 
> > According to the Zynq TRM §20.2.2 under heading Read Transfer, HOLD should
> > be set or cleared before writing the I2C Address register. This fixes
> > sporadic i2c bus lockups on National Instruments Zynq-based devices.
> > 
> > Signed-off-by: Kyle Roeschley <kyle.roeschley@ni.com>
> > ---
> > Tested by flooding the bus with traffic on a National Instruments cRIO-9068.
> > Previously the bus would lock up after ~30s, now I can run overnight with no
> > problems.
> 
> Can you please share that app you use to be able to replicate it on our
> side?

I've seen the problem in two cases, neither of which are very easy to
replicate:

1. Continuously rebooting a cRIO-9068 running an NI Linux kernel. This was how
   we caught the error originally, but it unfortunately takes 10+ hours to
   fail. In this case, I saw a timeout error caused by a two-message transfer
   immediately on load of the nizynqcpld driver [1]. Linked is our 4.9 tree
   because its our latest released branch, but we're currently testing on top
   of 4.14.
2. Writing on the status LED on a cRIO-9068 in a loop. This only takes a 30s to
   a couple minutes to fail, in which case I see the same timeout error from
   i2c-cadence and the system hangs. The script I used is below [2].

I'm working on a better MWE (that doesn't use our out of tree driver), but I
don't see the same timeout error when initiating the same transfers in a user
mode C application via i2c-dev.

Do you know why the patch I'm reverting seems to conflict with the Zynq TRM?
My only guess is that the "IP datasheet" mentioned in the commit message
referes to the Cadence I2C IP manual, but I don't have access to that document.

[1] https://github.com/ni/linux/blob/nilrt/18.0/4.9/drivers/misc/nizynqcpld.c#L1569

[2] LED test script:
#!/bin/bash
set -e

LED="/sys/class/leds/nilrt:status:yellow/brightness"
i=0
while (( 1 ))
do
	echo $i
	echo 99999 > $LED
	echo 0 > $LED
	i=$((i+1))
done
Kyle Roeschley Dec. 12, 2018, 6:59 p.m. | #3
On Fri, Dec 07, 2018 at 04:22:38PM -0600, Kyle Roeschley wrote:
> On Fri, Dec 07, 2018 at 10:15:11AM +0100, Michal Simek wrote:
> > On 06. 12. 18 18:33, Kyle Roeschley wrote:
> > > This reverts commit 8064c616984eaa015f018dba595d78cd24a0cc8c.
> > > 
> > > According to the Zynq TRM §20.2.2 under heading Read Transfer, HOLD should
> > > be set or cleared before writing the I2C Address register. This fixes
> > > sporadic i2c bus lockups on National Instruments Zynq-based devices.
> > > 
> > > Signed-off-by: Kyle Roeschley <kyle.roeschley@ni.com>
> > > ---
> > > Tested by flooding the bus with traffic on a National Instruments cRIO-9068.
> > > Previously the bus would lock up after ~30s, now I can run overnight with no
> > > problems.
> > 
> > Can you please share that app you use to be able to replicate it on our
> > side?
> 
> I've seen the problem in two cases, neither of which are very easy to
> replicate:
> 
> 1. Continuously rebooting a cRIO-9068 running an NI Linux kernel. This was how
>    we caught the error originally, but it unfortunately takes 10+ hours to
>    fail. In this case, I saw a timeout error caused by a two-message transfer
>    immediately on load of the nizynqcpld driver [1]. Linked is our 4.9 tree
>    because its our latest released branch, but we're currently testing on top
>    of 4.14.
> 2. Writing on the status LED on a cRIO-9068 in a loop. This only takes a 30s to
>    a couple minutes to fail, in which case I see the same timeout error from
>    i2c-cadence and the system hangs. The script I used is below [2].
> 
> I'm working on a better MWE (that doesn't use our out of tree driver), but I
> don't see the same timeout error when initiating the same transfers in a user
> mode C application via i2c-dev.

Debugging further, I see the timeout when I do a write and a read in one
transfer [1]. It doesn't reproduce in my user mode application because i2c-dev
doesn't allow me to initiate a single transfer with a read and a write
back-to-back. If I split up nizynqcpld_read() into two separate transfers, it
appears to work fine (ran overnight without a timeout).

> Do you know why the patch I'm reverting seems to conflict with the Zynq TRM?
> My only guess is that the "IP datasheet" mentioned in the commit message
> referes to the Cadence I2C IP manual, but I don't have access to that document.

[1] https://github.com/ni/linux/blob/nilrt/18.0/4.9/drivers/misc/nizynqcpld.c#L335
Wolfram Sang Dec. 12, 2018, 9:29 p.m. | #4
> Debugging further, I see the timeout when I do a write and a read in one
> transfer [1]. It doesn't reproduce in my user mode application because i2c-dev
> doesn't allow me to initiate a single transfer with a read and a write
> back-to-back.

You can get a recent version of i2c-tools and use 'i2ctransfer' for such
transfers.

Patch

diff --git a/drivers/i2c/busses/i2c-cadence.c b/drivers/i2c/busses/i2c-cadence.c
index e39d002a385c..f1fa84af2492 100644
--- a/drivers/i2c/busses/i2c-cadence.c
+++ b/drivers/i2c/busses/i2c-cadence.c
@@ -651,15 +651,15 @@  static void cdns_i2c_mrecv(struct cdns_i2c *id)
 		cdns_i2c_writereg(id->recv_count, CDNS_I2C_XFER_SIZE_OFFSET);
 	}
 
-	/* Set the slave address in address register - triggers operation */
-	cdns_i2c_writereg(CDNS_I2C_ENABLED_INTR_MASK, CDNS_I2C_IER_OFFSET);
-	cdns_i2c_writereg(id->p_msg->addr & CDNS_I2C_ADDR_MASK,
-						CDNS_I2C_ADDR_OFFSET);
 	/* Clear the bus hold flag if bytes to receive is less than FIFO size */
 	if (!id->bus_hold_flag &&
 		((id->p_msg->flags & I2C_M_RECV_LEN) != I2C_M_RECV_LEN) &&
 		(id->recv_count <= CDNS_I2C_FIFO_DEPTH))
 			cdns_i2c_clear_bus_hold(id);
+	/* Set the slave address in address register - triggers operation */
+	cdns_i2c_writereg(CDNS_I2C_ENABLED_INTR_MASK, CDNS_I2C_IER_OFFSET);
+	cdns_i2c_writereg(id->p_msg->addr & CDNS_I2C_ADDR_MASK,
+						CDNS_I2C_ADDR_OFFSET);
 }
 
 /**