Message ID | 20100811151257.GB23317@hmsreliant.think-freely.org |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, 11 Aug 2010 11:12:57 -0400 Neil Horman <nhorman@tuxdriver.com> wrote: > If netconsole is in use, there is a possibility for deadlock in 3c59x between > boomerang_interrupt and boomerang_start_xmit. Both routines take the vp->lock, > and if netconsole is in use, a pr_* call from the boomerang_interrupt routine > will result in the netconsole code attempting to trnasmit an skb, which can try > to take the same spin lock, resulting in de I thought we agreed that any device supporting netconsole agrees to not call printk in the transmit path. Just kill the pr_* call. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Aug 11, 2010 at 12:09:00PM -0400, Stephen Hemminger wrote: > On Wed, 11 Aug 2010 11:12:57 -0400 > Neil Horman <nhorman@tuxdriver.com> wrote: > > > If netconsole is in use, there is a possibility for deadlock in 3c59x between > > boomerang_interrupt and boomerang_start_xmit. Both routines take the vp->lock, > > and if netconsole is in use, a pr_* call from the boomerang_interrupt routine > > will result in the netconsole code attempting to trnasmit an skb, which can try > > to take the same spin lock, resulting in de > > I thought we agreed that any device supporting netconsole agrees > to not call printk in the transmit path. Just kill the pr_* call. > 3c59x isn't calling pr_* in the transmit path, its calling it in the interrupt handler, but since the interrupt handler and the transmit path share a spinlock, you get a deadlock in that case. We could kill the several pr_debug calls in the interrupt path, but I don't think thats really needed in this case. Neil -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Stephen Hemminger <shemminger@vyatta.com> wrote: > > I thought we agreed that any device supporting netconsole agrees > to not call printk in the transmit path. Just kill the pr_* call. Not calling printk from the driver doesn't help since you can still call printk from an IRQ handler. Cheers,
Le mercredi 11 août 2010 à 11:12 -0400, Neil Horman a écrit : > If netconsole is in use, there is a possibility for deadlock in 3c59x between > boomerang_interrupt and boomerang_start_xmit. Both routines take the vp->lock, > and if netconsole is in use, a pr_* call from the boomerang_interrupt routine > will result in the netconsole code attempting to trnasmit an skb, which can try > to take the same spin lock, resulting in deadlock. > > The fix is pretty straightforward. This patch allocats a bit in the 3c59x > private structure to indicate that its handling an interrupt. If we get into > the transmit routine and that bit is set, we can be sure that we have recursed > and will deadlock if we continue, so instead we just return NETDEV_TX_BUSY, so > the stack requeues the skb to try again later. > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com> > --- > drivers/net/3c59x.c | 15 ++++++++++++++- > 1 files changed, 14 insertions(+), 1 deletions(-) > > diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c > index c754d88..c685a55 100644 > --- a/drivers/net/3c59x.c > +++ b/drivers/net/3c59x.c > @@ -633,7 +633,8 @@ struct vortex_private { > open:1, > medialock:1, > must_free_region:1, /* Flag: if zero, Cardbus owns the I/O region */ > - large_frames:1; /* accept large frames */ > + large_frames:1, /* accept large frames */ > + handling_irq:1; /* private in_irq indicator */ It would be safer and faster to use a dedicated 'int' instead of a bitfield. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 23, 2010 at 10:01:34PM +0200, Eric Dumazet wrote: > Le mercredi 11 août 2010 à 11:12 -0400, Neil Horman a écrit : > > If netconsole is in use, there is a possibility for deadlock in 3c59x between > > boomerang_interrupt and boomerang_start_xmit. Both routines take the vp->lock, > > and if netconsole is in use, a pr_* call from the boomerang_interrupt routine > > will result in the netconsole code attempting to trnasmit an skb, which can try > > to take the same spin lock, resulting in deadlock. > > > > The fix is pretty straightforward. This patch allocats a bit in the 3c59x > > private structure to indicate that its handling an interrupt. If we get into > > the transmit routine and that bit is set, we can be sure that we have recursed > > and will deadlock if we continue, so instead we just return NETDEV_TX_BUSY, so > > the stack requeues the skb to try again later. > > > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com> > > --- > > drivers/net/3c59x.c | 15 ++++++++++++++- > > 1 files changed, 14 insertions(+), 1 deletions(-) > > > > diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c > > index c754d88..c685a55 100644 > > --- a/drivers/net/3c59x.c > > +++ b/drivers/net/3c59x.c > > @@ -633,7 +633,8 @@ struct vortex_private { > > open:1, > > medialock:1, > > must_free_region:1, /* Flag: if zero, Cardbus owns the I/O region */ > > - large_frames:1; /* accept large frames */ > > + large_frames:1, /* accept large frames */ > > + handling_irq:1; /* private in_irq indicator */ > > > It would be safer and faster to use a dedicated 'int' instead of a > bitfield. > > > > Faster I agree with, although I'm not sure if speed is really a big issue here, given that this is a ancient (but fairly well used) 10/100 adapter. And we still have space in the octet that that bitfield is living in, so I figured I'd use that anyway. As for safe, I'm not sure I follow you on that point. Is there something inherently dangerous about using a bitfield in this case? Neil -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le lundi 23 août 2010 à 16:24 -0400, Neil Horman a écrit : > > Faster I agree with, although I'm not sure if speed is really a big issue here, > given that this is a ancient (but fairly well used) 10/100 adapter. And we > still have space in the octet that that bitfield is living in, so I figured I'd > use that anyway. > > As for safe, I'm not sure I follow you on that point. Is there something > inherently dangerous about using a bitfield in this case? > A bitfield is not SMP safe. Are you sure another cpu is not changing another bit, using a non atomic RMW sequence, and your bit change is lost ? Quite tricky to check I suppose, so just add an "int" ;) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 23, 2010 at 10:48:58PM +0200, Eric Dumazet wrote: > Le lundi 23 août 2010 à 16:24 -0400, Neil Horman a écrit : > > > > > Faster I agree with, although I'm not sure if speed is really a big issue here, > > given that this is a ancient (but fairly well used) 10/100 adapter. And we > > still have space in the octet that that bitfield is living in, so I figured I'd > > use that anyway. > > > > As for safe, I'm not sure I follow you on that point. Is there something > > inherently dangerous about using a bitfield in this case? > > > > A bitfield is not SMP safe. > > Are you sure another cpu is not changing another bit, using a non atomic > RMW sequence, and your bit change is lost ? > > Quite tricky to check I suppose, so just add an "int" ;) > > > > Ah, I see what your saying. Since bitfield access is not a single instruction you get races when multiple accesses take place at once, since the assignment is non-atomic, nor is the read. In this case, thats ok. I say that because in the only time we really care what the value of this bitfield is, is when we've recursed from the interrupt handler to the netconsole module, back into the transmit path on the same cpu. In that case the read is guaranteed to be ordered after the write, since its a linear code path. In the case where cpu 0 is in the interrupt handler and cpu1 is going into the transmit method for this driver, we don't really care what the value of the bitfield is, its a don't care. If we read it as a zero, thats ok, we have the driver-internal sempahore still protecting us (the one that deadlocks if you recurse via netconsole on the same cpu). And if we read it as a 1, thats ok too, because we simply cause the network scheduler to queue the frame and send it again as soon as we're out of the interrupt handler. Theres still the size vs. speed issue with the conversion to int though. I figured since it was already using bitfields in lots of places, there wouldn't be much performance impact. But if you feel really strongly about it, let me know, and I'll make the conversion. Neil -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Neil Horman <nhorman@tuxdriver.com> Date: Mon, 23 Aug 2010 19:41:44 -0400 > Ah, I see what your saying. Since bitfield access is not a single instruction > you get races when multiple accesses take place at once, since the assignment is > non-atomic, nor is the read. > > In this case, thats ok. I say that because in the only time we really care what > the value of this bitfield is, is when we've recursed from the interrupt handler > to the netconsole module, back into the transmit path on the same cpu. In that > case the read is guaranteed to be ordered after the write, since its a linear > code path. In the case where cpu 0 is in the interrupt handler and cpu1 is > going into the transmit method for this driver, we don't really care what the > value of the bitfield is, its a don't care. If we read it as a zero, thats ok, > we have the driver-internal sempahore still protecting us (the one that > deadlocks if you recurse via netconsole on the same cpu). And if we read it as > a 1, thats ok too, because we simply cause the network scheduler to queue the > frame and send it again as soon as we're out of the interrupt handler. Right this should be OK. The only write to the bit happens with the lock held. The other bits are never modified while the device is active and interrupts can run. I've applied Neil's patch, thanks Neil. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c index c754d88..c685a55 100644 --- a/drivers/net/3c59x.c +++ b/drivers/net/3c59x.c @@ -633,7 +633,8 @@ struct vortex_private { open:1, medialock:1, must_free_region:1, /* Flag: if zero, Cardbus owns the I/O region */ - large_frames:1; /* accept large frames */ + large_frames:1, /* accept large frames */ + handling_irq:1; /* private in_irq indicator */ int drv_flags; u16 status_enable; u16 intr_enable; @@ -2133,6 +2134,15 @@ boomerang_start_xmit(struct sk_buff *skb, struct net_device *dev) dev->name, vp->cur_tx); } + /* + * We can't allow a recursion from our interrupt handler back into the + * tx routine, as they take the same spin lock, and that causes + * deadlock. Just return NETDEV_TX_BUSY and let the stack try again in + * a bit + */ + if (vp->handling_irq) + return NETDEV_TX_BUSY; + if (vp->cur_tx - vp->dirty_tx >= TX_RING_SIZE) { if (vortex_debug > 0) pr_warning("%s: BUG! Tx Ring full, refusing to send buffer.\n", @@ -2335,11 +2345,13 @@ boomerang_interrupt(int irq, void *dev_id) ioaddr = vp->ioaddr; + /* * It seems dopey to put the spinlock this early, but we could race against vortex_tx_timeout * and boomerang_start_xmit */ spin_lock(&vp->lock); + vp->handling_irq = 1; status = ioread16(ioaddr + EL3_STATUS); @@ -2447,6 +2459,7 @@ boomerang_interrupt(int irq, void *dev_id) pr_debug("%s: exiting interrupt, status %4.4x.\n", dev->name, status); handler_exit: + vp->handling_irq = 0; spin_unlock(&vp->lock); return IRQ_HANDLED; }
If netconsole is in use, there is a possibility for deadlock in 3c59x between boomerang_interrupt and boomerang_start_xmit. Both routines take the vp->lock, and if netconsole is in use, a pr_* call from the boomerang_interrupt routine will result in the netconsole code attempting to trnasmit an skb, which can try to take the same spin lock, resulting in deadlock. The fix is pretty straightforward. This patch allocats a bit in the 3c59x private structure to indicate that its handling an interrupt. If we get into the transmit routine and that bit is set, we can be sure that we have recursed and will deadlock if we continue, so instead we just return NETDEV_TX_BUSY, so the stack requeues the skb to try again later. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- drivers/net/3c59x.c | 15 ++++++++++++++- 1 files changed, 14 insertions(+), 1 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html