Patchwork [9/9] I2C: mv64xxx: fix race between FSM/interrupt and process context

Submitter Russell King
Date May 16, 2013, 8:39 p.m.
Message ID <E1Ud4xE-0004KB-2T@rmk-PC.arm.linux.org.uk>
Permalink /patch/244422/
State Accepted

Comments

Russell King - May 16, 2013, 8:39 p.m.
Asking for a multi-part message to be handled by this driver is racy; it
has been observed that the following sequence is possible with this
driver:

	- send start
	- send address + write
	- send data
	- send (repeated) start
	- send address + write
	- send (repeated) start
	- send address + read
	- unrecoverable bus hang (except by system reset)

The problem is that the interrupt handling sees the next event after the
first repeated start is sent - the IFLG bit is set in the register even
though INTEN is disabled.

Let's fix this by moving all of the message processing into interrupt
context, rather than having it partly in IRQ and partly in process
context.  This allows us to move immediately to the next message in the
interrupt handler and get on with the transfer, rather than incurring a
couple of scheduling switches to get the next message.
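
The control flow described above can be illustrated with a small
user-space model. This is only a sketch, not the driver code: the toy_*
types and functions are hypothetical stand-ins for struct i2c_msg and the
driver state, and no hardware registers are touched. It shows the one
idea of the patch: process context hands over the whole message array
once, and the "interrupt" side advances to the next message at each
repeated start, waking the waiter only after the final stop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct i2c_msg. */
struct toy_msg {
	int addr;
	int len;
};

/* Hypothetical, simplified stand-in for the driver state. */
struct toy_xfer {
	const struct toy_msg *msgs;	/* message currently on the wire */
	int num_msgs;			/* messages left, including current */
	int send_stop;			/* current message ends with STOP */
	int restarts;			/* repeated STARTs issued so far */
	int done;			/* STOP issued, waiter may wake */
};

/* Process context: hand the whole array over once, then wait. */
static void toy_xfer_begin(struct toy_xfer *x,
			   const struct toy_msg *msgs, int num)
{
	x->msgs = msgs;
	x->num_msgs = num;
	x->send_stop = (num == 1);
	x->restarts = 0;
	x->done = 0;
}

/* "Interrupt" context: the current message has completed. */
static void toy_irq_msg_complete(struct toy_xfer *x)
{
	if (x->send_stop) {
		x->done = 1;		/* only now wake the waiter */
		return;
	}

	/* Repeated START: move to the next message right here. */
	x->restarts++;
	x->msgs++;
	x->num_msgs--;
	x->send_stop = (x->num_msgs == 1);
}

/* Drive the model to completion; returns the number of "interrupts". */
static int toy_run(struct toy_xfer *x, const struct toy_msg *msgs, int num)
{
	int irqs = 0;

	toy_xfer_begin(x, msgs, num);
	while (!x->done) {
		toy_irq_msg_complete(x);
		irqs++;
	}
	return irqs;
}
```

In this model a three-message transfer issues two repeated starts and
wakes the waiter exactly once, with no round trip through process
context between messages.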

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 drivers/i2c/busses/i2c-mv64xxx.c |   54 ++++++++++++++++++++++++--------------
 1 files changed, 34 insertions(+), 20 deletions(-)
Wolfram Sang - May 17, 2013, 9:51 a.m.
> @@ -271,12 +273,25 @@ mv64xxx_i2c_do_action(struct mv64xxx_i2c_data *drv_data)
>  {
>  	switch(drv_data->action) {
>  	case MV64XXX_I2C_ACTION_SEND_RESTART:
> +		/* We should only get here if we have further messages */
> +		BUG_ON(drv_data->num_msgs == 0);
> +

...

> @@ -453,16 +463,20 @@ static int
>  mv64xxx_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num)
>  {
>  	struct mv64xxx_i2c_data *drv_data = i2c_get_adapdata(adap);
> -	int	i, rc;
> +	int rc, ret = num;
>  
> -	for (i = 0; i < num; i++) {
> -		rc = mv64xxx_i2c_execute_msg(drv_data, &msgs[i],
> -						i == 0, i + 1 == num);
> -		if (rc < 0)
> -			return rc;
> -	}
> +	BUG_ON(drv_data->msgs != NULL);

Can't we handle this more gracefully than to halt the kernel? Like
WARN_ON and resetting the controller or disabling the bus or...

--
To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Russell King - ARM Linux - May 17, 2013, 10 a.m.
On Fri, May 17, 2013 at 11:51:51AM +0200, Wolfram Sang wrote:
> >  {
> >  	switch(drv_data->action) {
> >  	case MV64XXX_I2C_ACTION_SEND_RESTART:
> > +		/* We should only get here if we have further messages */
> > +		BUG_ON(drv_data->num_msgs == 0);
> > +
> 
> ...
> 
> > @@ -453,16 +463,20 @@ static int
> >  mv64xxx_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num)
> >  {
> >  	struct mv64xxx_i2c_data *drv_data = i2c_get_adapdata(adap);
> > -	int	i, rc;
> > +	int rc, ret = num;
> >  
> > -	for (i = 0; i < num; i++) {
> > -		rc = mv64xxx_i2c_execute_msg(drv_data, &msgs[i],
> > -						i == 0, i + 1 == num);
> > -		if (rc < 0)
> > -			return rc;
> > -	}
> > +	BUG_ON(drv_data->msgs != NULL);
> 
> Can't we handle this more gracefully than to halt the kernel? Like
> WARN_ON and resetting the controller or disabling the bus or...

Well, the latter really is something which should never ever happen,
and if it does happen it can only really be because something's
screwed up in the locking in the I2C layer.

The former is more probable, and I thought about that, but I don't
have any good alternative solution.  Given the problems I've seen,
I don't think resetting the controller is really an option, because
that'll likely cause the bus to lock (that's the original problem
which I'm trying to solve in this patch.)  The thing really does
have to work according to the I2C protocol otherwise stuff goes
irrecoverably wrong to the point of needing an entire system reset.
Wolfram Sang - May 17, 2013, 12:15 p.m.
On Fri, May 17, 2013 at 11:00:16AM +0100, Russell King - ARM Linux wrote:
> On Fri, May 17, 2013 at 11:51:51AM +0200, Wolfram Sang wrote:
> > >  {
> > >  	switch(drv_data->action) {
> > >  	case MV64XXX_I2C_ACTION_SEND_RESTART:
> > > +		/* We should only get here if we have further messages */
> > > +		BUG_ON(drv_data->num_msgs == 0);
> > > +
> > 
> > ...
> > 
> > > @@ -453,16 +463,20 @@ static int
> > >  mv64xxx_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num)
> > >  {
> > >  	struct mv64xxx_i2c_data *drv_data = i2c_get_adapdata(adap);
> > > -	int	i, rc;
> > > +	int rc, ret = num;
> > >  
> > > -	for (i = 0; i < num; i++) {
> > > -		rc = mv64xxx_i2c_execute_msg(drv_data, &msgs[i],
> > > -						i == 0, i + 1 == num);
> > > -		if (rc < 0)
> > > -			return rc;
> > > -	}
> > > +	BUG_ON(drv_data->msgs != NULL);
> > 
> > Can't we handle this more gracefully than to halt the kernel? Like
> > WARN_ON and resetting the controller or disabling the bus or...
> 
> Well, the latter really is something which should never ever happen,
> and if it does happen it can only really be because something's
> screwed up in the locking in the I2C layer.

I'd think we should trust the layer here.

> The former is more probable, and I thought about that, but I don't
> have any good alternative solution.  Given the problems I've seen,
> I don't think resetting the controller is really an option, because
> that'll likely cause the bus to lock (that's the original problem
> which I'm trying to solve in this patch.)  The thing really does
> have to work according to the I2C protocol otherwise stuff goes
> irrecoverably wrong to the point of needing an entire system reset.

Fine with me for now. If somebody later has a setup where I2C slaves can
be reset (e.g. via GPIO), so a complete bus reset is possible, we might
need another solution, then.

Thanks,

   Wolfram
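
For readers comparing the two options discussed in this exchange, the
following user-space sketch shows what a "warn and refuse" check might
look like in place of BUG_ON(drv_data->msgs != NULL). It is not part of
the patch and not proposed driver code: the toy_* names are hypothetical,
fprintf() stands in for WARN_ON(), and returning -EBUSY is an assumption
about how such a variant could report the condition. It does not address
the bus-hang concern raised above for the restart case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct i2c_msg. */
struct toy_msg {
	int addr;
	int len;
};

/* Hypothetical, simplified stand-in for the driver state. */
struct toy_drv {
	const struct toy_msg *msgs;	/* non-NULL while a transfer runs */
	int num_msgs;
	int warned;			/* how often the check has fired */
};

/* Start a transfer, refusing (rather than halting) if one is in flight. */
static int toy_xfer_start(struct toy_drv *d,
			  const struct toy_msg *msgs, int num)
{
	if (d->msgs != NULL) {
		/* Stand-in for WARN_ON(): report the problem and bail out. */
		fprintf(stderr, "toy: transfer already in flight\n");
		d->warned++;
		return -EBUSY;
	}

	d->msgs = msgs;
	d->num_msgs = num;
	return 0;
}

/* Mark the transfer finished so the next one may start. */
static void toy_xfer_finish(struct toy_drv *d)
{
	d->msgs = NULL;
	d->num_msgs = 0;
}
```

The trade-off is the one raised in the thread: this keeps the system
running, but it relies on the caller handling the error, whereas the
condition can only arise if the locking in the I2C layer is broken.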

Patch

diff --git a/drivers/i2c/busses/i2c-mv64xxx.c b/drivers/i2c/busses/i2c-mv64xxx.c
index 7457ef5..a2e4633 100644
--- a/drivers/i2c/busses/i2c-mv64xxx.c
+++ b/drivers/i2c/busses/i2c-mv64xxx.c
@@ -86,6 +86,8 @@  enum {
 };
 
 struct mv64xxx_i2c_data {
+	struct i2c_msg		*msgs;
+	int			num_msgs;
 	int			irq;
 	u32			state;
 	u32			action;
@@ -194,7 +196,7 @@  mv64xxx_i2c_fsm(struct mv64xxx_i2c_data *drv_data, u32 status)
 		if ((drv_data->bytes_left == 0)
 				|| (drv_data->aborting
 					&& (drv_data->byte_posn != 0))) {
-			if (drv_data->send_stop) {
+			if (drv_data->send_stop || drv_data->aborting) {
 				drv_data->action = MV64XXX_I2C_ACTION_SEND_STOP;
 				drv_data->state = MV64XXX_I2C_STATE_IDLE;
 			} else {
@@ -271,12 +273,25 @@  mv64xxx_i2c_do_action(struct mv64xxx_i2c_data *drv_data)
 {
 	switch(drv_data->action) {
 	case MV64XXX_I2C_ACTION_SEND_RESTART:
+		/* We should only get here if we have further messages */
+		BUG_ON(drv_data->num_msgs == 0);
+
 		drv_data->cntl_bits |= MV64XXX_I2C_REG_CONTROL_START;
-		drv_data->cntl_bits &= ~MV64XXX_I2C_REG_CONTROL_INTEN;
 		writel(drv_data->cntl_bits,
 			drv_data->reg_base + MV64XXX_I2C_REG_CONTROL);
-		drv_data->block = 0;
-		wake_up(&drv_data->waitq);
+
+		drv_data->msgs++;
+		drv_data->num_msgs--;
+
+		/* Setup for the next message */
+		mv64xxx_i2c_prepare_for_io(drv_data, drv_data->msgs);
+
+		/*
+		 * We're never at the start of the message here, and by this
+		 * time it's already too late to do any protocol mangling.
+		 * Thankfully, do not advertise support for that feature.
+		 * Thankfully, we do not advertise support for that feature.
+		drv_data->send_stop = drv_data->num_msgs == 1;
 		break;
 
 	case MV64XXX_I2C_ACTION_CONTINUE:
@@ -412,20 +427,15 @@  mv64xxx_i2c_wait_for_completion(struct mv64xxx_i2c_data *drv_data)
 
 static int
 mv64xxx_i2c_execute_msg(struct mv64xxx_i2c_data *drv_data, struct i2c_msg *msg,
-				int is_first, int is_last)
+				int is_last)
 {
 	unsigned long	flags;
 
 	spin_lock_irqsave(&drv_data->lock, flags);
 	mv64xxx_i2c_prepare_for_io(drv_data, msg);
 
-	if (is_first) {
-		drv_data->action = MV64XXX_I2C_ACTION_SEND_START;
-		drv_data->state = MV64XXX_I2C_STATE_WAITING_FOR_START_COND;
-	} else {
-		drv_data->action = MV64XXX_I2C_ACTION_SEND_ADDR_1;
-		drv_data->state = MV64XXX_I2C_STATE_WAITING_FOR_ADDR_1_ACK;
-	}
+	drv_data->action = MV64XXX_I2C_ACTION_SEND_START;
+	drv_data->state = MV64XXX_I2C_STATE_WAITING_FOR_START_COND;
 
 	drv_data->send_stop = is_last;
 	drv_data->block = 1;
@@ -453,16 +463,20 @@  static int
 mv64xxx_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num)
 {
 	struct mv64xxx_i2c_data *drv_data = i2c_get_adapdata(adap);
-	int	i, rc;
+	int rc, ret = num;
 
-	for (i = 0; i < num; i++) {
-		rc = mv64xxx_i2c_execute_msg(drv_data, &msgs[i],
-						i == 0, i + 1 == num);
-		if (rc < 0)
-			return rc;
-	}
+	BUG_ON(drv_data->msgs != NULL);
+	drv_data->msgs = msgs;
+	drv_data->num_msgs = num;
+
+	rc = mv64xxx_i2c_execute_msg(drv_data, &msgs[0], num == 1);
+	if (rc < 0)
+		ret = rc;
+
+	drv_data->num_msgs = 0;
+	drv_data->msgs = NULL;
 
-	return num;
+	return ret;
 }
 
 static const struct i2c_algorithm mv64xxx_i2c_algo = {