Message ID | 1379093833-4949-2-git-send-email-nautsch2@gmail.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
On 13.09.2013 19:37, Andre Naujoks wrote:
> The locking is needed, since the internal buffer for the CAN frames is
> changed during the wakeup call. This could cause buffer inconsistencies
> under high loads, especially for the outgoing short CAN packet skbuffs.
>
> The needed locks led to deadlocks before commit
> "5ede52538ee2b2202d9dff5b06c33bfde421e6e4 tty: Remove extra wakeup from pty
> write() path", which removed the direct callback to the wakeup function from the
> tty layer.
>
> As slcan.c is based on slip.c the issue in the original code is fixed, too.
>
> Signed-off-by: Andre Naujoks <nautsch2@gmail.com>

At least for slcan.c:

Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>

Tnx for figuring that out with your heavy load testing.

Best regards,
Oliver

> ---
>  drivers/net/can/slcan.c | 3 +++
>  drivers/net/slip/slip.c | 3 +++
>  2 files changed, 6 insertions(+)
>
> diff --git a/drivers/net/can/slcan.c b/drivers/net/can/slcan.c
> index 874188b..d571e2e 100644
> --- a/drivers/net/can/slcan.c
> +++ b/drivers/net/can/slcan.c
> @@ -286,11 +286,13 @@ static void slcan_write_wakeup(struct tty_struct *tty)
>  	if (!sl || sl->magic != SLCAN_MAGIC || !netif_running(sl->dev))
>  		return;
>
> +	spin_lock(&sl->lock);
>  	if (sl->xleft <= 0) {
>  		/* Now serial buffer is almost free & we can start
>  		 * transmission of another packet */
>  		sl->dev->stats.tx_packets++;
>  		clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
> +		spin_unlock(&sl->lock);
>  		netif_wake_queue(sl->dev);
>  		return;
>  	}
> @@ -298,6 +300,7 @@ static void slcan_write_wakeup(struct tty_struct *tty)
>  	actual = tty->ops->write(tty, sl->xhead, sl->xleft);
>  	sl->xleft -= actual;
>  	sl->xhead += actual;
> +	spin_unlock(&sl->lock);
>  }
>
>  /* Send a can_frame to a TTY queue. */
> diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
> index a34d6bf..cc70ecf 100644
> --- a/drivers/net/slip/slip.c
> +++ b/drivers/net/slip/slip.c
> @@ -429,11 +429,13 @@ static void slip_write_wakeup(struct tty_struct *tty)
>  	if (!sl || sl->magic != SLIP_MAGIC || !netif_running(sl->dev))
>  		return;
>
> +	spin_lock(&sl->lock);
>  	if (sl->xleft <= 0) {
>  		/* Now serial buffer is almost free & we can start
>  		 * transmission of another packet */
>  		sl->dev->stats.tx_packets++;
>  		clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
> +		spin_unlock(&sl->lock);
>  		sl_unlock(sl);
>  		return;
>  	}
> @@ -441,6 +443,7 @@ static void slip_write_wakeup(struct tty_struct *tty)
>  	actual = tty->ops->write(tty, sl->xhead, sl->xleft);
>  	sl->xleft -= actual;
>  	sl->xhead += actual;
> +	spin_unlock(&sl->lock);
>  }
>
>  static void sl_tx_timeout(struct net_device *dev)
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/13/2013 07:37 PM, Andre Naujoks wrote:
> The locking is needed, since the internal buffer for the CAN frames is
> changed during the wakeup call. This could cause buffer inconsistencies
> under high loads, especially for the outgoing short CAN packet skbuffs.
>
> The needed locks led to deadlocks before commit
> "5ede52538ee2b2202d9dff5b06c33bfde421e6e4 tty: Remove extra wakeup from pty
> write() path", which removed the direct callback to the wakeup function from the
> tty layer.

What does that mean for older kernels? (<
5ede52538ee2b2202d9dff5b06c33bfde421e6e4)

> As slcan.c is based on slip.c the issue in the original code is fixed, too.
>
> Signed-off-by: Andre Naujoks <nautsch2@gmail.com>

Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>

Marc
On 19.09.2013 11:36, Marc Kleine-Budde wrote:
> On 09/13/2013 07:37 PM, Andre Naujoks wrote:
>> The locking is needed, since the internal buffer for the CAN
>> frames is changed during the wakeup call. This could cause buffer
>> inconsistencies under high loads, especially for the outgoing
>> short CAN packet skbuffs.
>>
>> The needed locks led to deadlocks before commit
>> "5ede52538ee2b2202d9dff5b06c33bfde421e6e4 tty: Remove extra
>> wakeup from pty write() path", which removed the direct callback
>> to the wakeup function from the tty layer.
>
> What does that mean for older kernels? (<
> 5ede52538ee2b2202d9dff5b06c33bfde421e6e4)

It seems the slcan (and slip) driver is broken for older kernels. See
this thread for a discussion about the patch in pty.c.

http://marc.info/?l=linux-kernel&m=137269017002789&w=2

The patch from Peter Hurley was actually already in the queue, when I
ran into the problem, and is now in kernel 3.12.

Without the pty patch and slow CAN traffic, the driver works, because
the wakeup is called directly from the pty driver. That is also the
reason why there was no locking. It would just deadlock.

When the pty driver defers the wakeup, we ran into synchronisation
problems (which should be fixed by the locking) and eventually into a
kernel panic because of a recursive loop (which should be fixed by the
pty.c patch).

Maybe it is possible to get both patches back into the stable branches?

Regards
  Andre

>
>> As slcan.c is based on slip.c the issue in the original code is
>> fixed, too.
>>
>> Signed-off-by: Andre Naujoks <nautsch2@gmail.com>
> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
>
> Marc
>
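The deadlock described in this message can be illustrated with a small userspace model. This is only a sketch and not kernel code: `struct model`, `pty_write()`, `xmit()`, `run_deferred()` and the test-and-set lock are hypothetical stand-ins, and it assumes the transmit path holds the driver lock while it calls into the tty write. Where a real `spin_lock()` would spin forever, the model records a failed `try_lock()` instead.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: a non-recursive lock built on a
 * test-and-set flag, which is roughly how a spinlock behaves. */
struct model {
	atomic_flag lock;
	bool direct_wakeup;   /* true: write() calls the wakeup directly */
	bool wakeup_pending;  /* set when the wakeup is deferred instead */
	int would_deadlock;   /* times the wakeup found the lock already held */
	int wakeups_run;      /* times the wakeup ran with the lock taken */
};

static void model_init(struct model *m, bool direct_wakeup)
{
	atomic_flag_clear(&m->lock);
	m->direct_wakeup = direct_wakeup;
	m->wakeup_pending = false;
	m->would_deadlock = 0;
	m->wakeups_run = 0;
}

static bool try_lock(struct model *m)
{
	return !atomic_flag_test_and_set(&m->lock);
}

static void unlock(struct model *m)
{
	atomic_flag_clear(&m->lock);
}

/* Wakeup with locking added. A real spin_lock() would never return in
 * the case that is counted as would_deadlock here. */
static void write_wakeup(struct model *m)
{
	if (!try_lock(m)) {
		m->would_deadlock++;
		return;
	}
	m->wakeups_run++;
	unlock(m);
}

/* Stand-in for the pty write(): calls the wakeup at once or defers it. */
static void pty_write(struct model *m)
{
	if (m->direct_wakeup)
		write_wakeup(m);
	else
		m->wakeup_pending = true;
}

/* Transmit path: assumed to hold the lock across the write. */
static void xmit(struct model *m)
{
	if (!try_lock(m))
		return;
	pty_write(m);
	unlock(m);
}

/* Runs a deferred wakeup later, outside the transmit path. */
static void run_deferred(struct model *m)
{
	if (m->wakeup_pending) {
		m->wakeup_pending = false;
		write_wakeup(m);
	}
}
```

With `direct_wakeup` set, a single `xmit()` makes the wakeup hit the lock its own caller holds. With it cleared, `xmit()` finishes first and `run_deferred()` takes the lock without contention, which matches the ordering the thread attributes to the pty change.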
On 09/19/2013 12:29 PM, Andre Naujoks wrote:
> On 19.09.2013 11:36, Marc Kleine-Budde wrote:
>> On 09/13/2013 07:37 PM, Andre Naujoks wrote:
>>> The locking is needed, since the internal buffer for the CAN
>>> frames is changed during the wakeup call. This could cause buffer
>>> inconsistencies under high loads, especially for the outgoing
>>> short CAN packet skbuffs.
>>>
>>> The needed locks led to deadlocks before commit
>>> "5ede52538ee2b2202d9dff5b06c33bfde421e6e4 tty: Remove extra
>>> wakeup from pty write() path", which removed the direct callback
>>> to the wakeup function from the tty layer.
>>
>> What does that mean for older kernels? (<
>> 5ede52538ee2b2202d9dff5b06c33bfde421e6e4)
>
> It seems the slcan (and slip) driver is broken for older kernels. See
> this thread for a discussion about the patch in pty.c.
>
> http://marc.info/?l=linux-kernel&m=137269017002789&w=2

Thanks for the info.

> The patch from Peter Hurley was actually already in the queue, when I
> ran into the problem, and is now in kernel 3.12.
>
> Without the pty patch and slow CAN traffic, the driver works, because
> the wakeup is called directly from the pty driver. That is also the
> reason why there was no locking. It would just deadlock.
>
> When the pty driver defers the wakeup, we ran into synchronisation
> problems (which should be fixed by the locking) and eventually into a
> kernel panic because of a recursive loop (which should be fixed by the
> pty.c patch).
>
> Maybe it is possible to get both patches back into the stable branches?

Sounds reasonable. You might get in touch with Peter Hurley, if his
patch is scheduled for stable. Documentation/stable_kernel_rules.txt
suggests a procedure if your patch depends on others to be cherry picked.

Marc
[ +cc Greg Kroah-Hartman]

On 09/19/2013 06:35 AM, Marc Kleine-Budde wrote:
> On 09/19/2013 12:29 PM, Andre Naujoks wrote:
>> On 19.09.2013 11:36, Marc Kleine-Budde wrote:
>>> On 09/13/2013 07:37 PM, Andre Naujoks wrote:
>>>> The locking is needed, since the internal buffer for the CAN
>>>> frames is changed during the wakeup call. This could cause buffer
>>>> inconsistencies under high loads, especially for the outgoing
>>>> short CAN packet skbuffs.
>>>>
>>>> The needed locks led to deadlocks before commit
>>>> "5ede52538ee2b2202d9dff5b06c33bfde421e6e4 tty: Remove extra
>>>> wakeup from pty write() path", which removed the direct callback
>>>> to the wakeup function from the tty layer.
>>>
>>> What does that mean for older kernels? (<
>>> 5ede52538ee2b2202d9dff5b06c33bfde421e6e4)
>>
>> It seems the slcan (and slip) driver is broken for older kernels. See
>> this thread for a discussion about the patch in pty.c.
>>
>> http://marc.info/?l=linux-kernel&m=137269017002789&w=2
>
> Thanks for the info.
>
>> The patch from Peter Hurley was actually already in the queue, when I
>> ran into the problem, and is now in kernel 3.12.
>>
>> Without the pty patch and slow CAN traffic, the driver works, because
>> the wakeup is called directly from the pty driver. That is also the
>> reason why there was no locking. It would just deadlock.
>>
>> When the pty driver defers the wakeup, we ran into synchronisation
>> problems (which should be fixed by the locking) and eventually into a
>> kernel panic because of a recursive loop (which should be fixed by the
>> pty.c patch).
>>
>> Maybe it is possible to get both patches back into the stable branches?
>
> Sounds reasonable. You might get in touch with Peter Hurley, if his
> patch is scheduled for stable. Documentation/stable_kernel_rules.txt
> suggests a procedure if your patch depends on others to be cherry picked.

Already following along.

I'd like to wait for 3.12 release before the pty patch goes to -stable
(so that it gets more in-the-wild testing).

Regards,
Peter Hurley
diff --git a/drivers/net/can/slcan.c b/drivers/net/can/slcan.c
index 874188b..d571e2e 100644
--- a/drivers/net/can/slcan.c
+++ b/drivers/net/can/slcan.c
@@ -286,11 +286,13 @@ static void slcan_write_wakeup(struct tty_struct *tty)
 	if (!sl || sl->magic != SLCAN_MAGIC || !netif_running(sl->dev))
 		return;
 
+	spin_lock(&sl->lock);
 	if (sl->xleft <= 0) {
 		/* Now serial buffer is almost free & we can start
 		 * transmission of another packet */
 		sl->dev->stats.tx_packets++;
 		clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
+		spin_unlock(&sl->lock);
 		netif_wake_queue(sl->dev);
 		return;
 	}
@@ -298,6 +300,7 @@ static void slcan_write_wakeup(struct tty_struct *tty)
 	actual = tty->ops->write(tty, sl->xhead, sl->xleft);
 	sl->xleft -= actual;
 	sl->xhead += actual;
+	spin_unlock(&sl->lock);
 }
 
 /* Send a can_frame to a TTY queue. */
diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index a34d6bf..cc70ecf 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -429,11 +429,13 @@ static void slip_write_wakeup(struct tty_struct *tty)
 	if (!sl || sl->magic != SLIP_MAGIC || !netif_running(sl->dev))
 		return;
 
+	spin_lock(&sl->lock);
 	if (sl->xleft <= 0) {
 		/* Now serial buffer is almost free & we can start
 		 * transmission of another packet */
 		sl->dev->stats.tx_packets++;
 		clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
+		spin_unlock(&sl->lock);
 		sl_unlock(sl);
 		return;
 	}
@@ -441,6 +443,7 @@ static void slip_write_wakeup(struct tty_struct *tty)
 	actual = tty->ops->write(tty, sl->xhead, sl->xleft);
 	sl->xleft -= actual;
 	sl->xhead += actual;
+	spin_unlock(&sl->lock);
 }
 
 static void sl_tx_timeout(struct net_device *dev)
The locking is needed, since the internal buffer for the CAN frames is
changed during the wakeup call. This could cause buffer inconsistencies
under high loads, especially for the outgoing short CAN packet skbuffs.

The needed locks led to deadlocks before commit
"5ede52538ee2b2202d9dff5b06c33bfde421e6e4 tty: Remove extra wakeup from pty
write() path", which removed the direct callback to the wakeup function from the
tty layer.

As slcan.c is based on slip.c the issue in the original code is fixed, too.

Signed-off-by: Andre Naujoks <nautsch2@gmail.com>
---
 drivers/net/can/slcan.c | 3 +++
 drivers/net/slip/slip.c | 3 +++
 2 files changed, 6 insertions(+)
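The locking pattern from the patch can be tried outside the kernel with a minimal userspace sketch. Everything here is an assumption made for illustration: `struct chan`, `fake_tty_write()` and `chan_xmit()` are hypothetical, a pthread mutex stands in for `sl->lock`, and the transmit path is assumed to take the same lock while it loads the buffer. Only the shape of `chan_write_wakeup()` follows the patched function: `xhead` and `xleft` are touched under the lock, and the lock is released on both exit paths.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the driver state; not the kernel struct. */
struct chan {
	pthread_mutex_t lock;     /* plays the role of sl->lock */
	unsigned char xbuff[64];  /* encoded frame waiting to go out */
	unsigned char *xhead;     /* next byte to hand to the fake tty */
	int xleft;                /* bytes still to send */
	unsigned char wire[256];  /* bytes the fake tty has accepted so far */
	size_t wire_len;
	int tx_packets;
};

static void chan_init(struct chan *c)
{
	memset(c, 0, sizeof(*c));
	pthread_mutex_init(&c->lock, NULL);
	c->xhead = c->xbuff;
}

/* Fake tty write: accepts at most 4 bytes per call to force partial writes. */
static int fake_tty_write(struct chan *c, const unsigned char *buf, int len)
{
	int n = len < 4 ? len : 4;

	memcpy(c->wire + c->wire_len, buf, (size_t)n);
	c->wire_len += (size_t)n;
	return n;
}

/* Transmit path: load a new frame and push out the first chunk. */
static void chan_xmit(struct chan *c, const void *data, int len)
{
	int actual;

	pthread_mutex_lock(&c->lock);
	memcpy(c->xbuff, data, (size_t)len);
	actual = fake_tty_write(c, c->xbuff, len);
	c->xleft = len - actual;
	c->xhead = c->xbuff + actual;
	pthread_mutex_unlock(&c->lock);
}

/* Wakeup path, shaped like the patched function. Returns 1 once the
 * frame has been sent completely, 0 if more data was written. */
static int chan_write_wakeup(struct chan *c)
{
	int actual;

	pthread_mutex_lock(&c->lock);
	if (c->xleft <= 0) {
		c->tx_packets++;
		pthread_mutex_unlock(&c->lock);  /* early-return unlock */
		return 1;
	}
	actual = fake_tty_write(c, c->xhead, c->xleft);
	c->xleft -= actual;
	c->xhead += actual;
	pthread_mutex_unlock(&c->lock);
	return 0;
}
```

Calling `chan_xmit()` once and then `chan_write_wakeup()` until it returns 1 drains the frame in 4-byte chunks. Without the lock, a concurrent `chan_xmit()` could overwrite `xbuff` and reset `xhead`/`xleft` between the write and the pointer update, which is the kind of buffer inconsistency the commit message describes.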