Message ID | 20100809163210.GA1781@hmsreliant.think-freely.org |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
From: Neil Horman <nhorman@tuxdriver.com> Date: Mon, 9 Aug 2010 12:32:10 -0400 > When using netpoll, its possible to deadlock the 3c59x driver. Since it takes > an internal spinlock (vp->lock) that serializes boomerang_interrupt and parts > of boomerang_start_xmit, if we call pr_debug in the former, we can go through > the tx path on the same cpu, and recurse into the same driver again, deadlocking > in the transmit routine. > > This patch fixes that problem by stopping the queues during interrupt > processing, so that subsequent transmits will get queued until a later point. > Its not a great solution, but we need to find some way to serialize access to > the register file on the card without enforcing a deadlock. I think the queue > stop is the best way to do that. And since we only print things in > boomerang_interrupt when we have debug enabled, we can mitigate the impact of > this change to only stop the queues when debug is on. > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Nothing serialized changes to "vortex_debug" with the tests you're making here. Simply turning it off when a packet arrives can deadlock the send queue of the device. Even if proper serializatio did exist, I still see this as an awful solution. And the default value of this thing is "1" so it's always going to be doing this send queue state flipping for effectively everyone. Please find another way to solve this problem. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Aug 10, 2010 at 04:48:03PM -0700, David Miller wrote: > From: Neil Horman <nhorman@tuxdriver.com> > Date: Mon, 9 Aug 2010 12:32:10 -0400 > > > When using netpoll, its possible to deadlock the 3c59x driver. Since it takes > > an internal spinlock (vp->lock) that serializes boomerang_interrupt and parts > > of boomerang_start_xmit, if we call pr_debug in the former, we can go through > > the tx path on the same cpu, and recurse into the same driver again, deadlocking > > in the transmit routine. > > > > This patch fixes that problem by stopping the queues during interrupt > > processing, so that subsequent transmits will get queued until a later point. > > Its not a great solution, but we need to find some way to serialize access to > > the register file on the card without enforcing a deadlock. I think the queue > > stop is the best way to do that. And since we only print things in > > boomerang_interrupt when we have debug enabled, we can mitigate the impact of > > this change to only stop the queues when debug is on. > > > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com> > > Nothing serialized changes to "vortex_debug" with the tests you're > making here. Simply turning it off when a packet arrives can deadlock > the send queue of the device. > > Even if proper serializatio did exist, I still see this as an awful > solution. > > And the default value of this thing is "1" so it's always going to be > doing this send queue state flipping for effectively everyone. > > Please find another way to solve this problem. > Crap-spackle, you're right. I was thinking the module_param defaulted it to zero, but it doesn't. Regardless, sysfs would let us change this value and deadlock it all to heck. Recinded. I'll find a better way to fix this. Sorry for the noise Neil -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c index c754d88..e5c8757 100644 --- a/drivers/net/3c59x.c +++ b/drivers/net/3c59x.c @@ -2338,7 +2338,17 @@ boomerang_interrupt(int irq, void *dev_id) /* * It seems dopey to put the spinlock this early, but we could race against vortex_tx_timeout * and boomerang_start_xmit + * We also need to disable the tx queue in the event that we print + * anything from this path. if pr_debug is called, we run the risk of + * recursing through the transmit path, which also takes the vp->lock, + * and deadlocks us. This only happens if we're using netpoll + * but we need to be ready for it + * We can mitigate the perf impact here if we only + * do this is vortex_debug is != 0 */ + if (vortex_debug) + netif_stop_queue(dev); + spin_lock(&vp->lock); status = ioread16(ioaddr + EL3_STATUS); @@ -2447,6 +2457,8 @@ boomerang_interrupt(int irq, void *dev_id) pr_debug("%s: exiting interrupt, status %4.4x.\n", dev->name, status); handler_exit: + if (vortex_debug) + netif_start_queue(dev); spin_unlock(&vp->lock); return IRQ_HANDLED; }
When using netpoll, its possible to deadlock the 3c59x driver. Since it takes an internal spinlock (vp->lock) that serializes boomerang_interrupt and parts of boomerang_start_xmit, if we call pr_debug in the former, we can go through the tx path on the same cpu, and recurse into the same driver again, deadlocking in the transmit routine. This patch fixes that problem by stopping the queues during interrupt processing, so that subsequent transmits will get queued until a later point. Its not a great solution, but we need to find some way to serialize access to the register file on the card without enforcing a deadlock. I think the queue stop is the best way to do that. And since we only print things in boomerang_interrupt when we have debug enabled, we can mitigate the impact of this change to only stop the queues when debug is on. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> --- drivers/net/3c59x.c | 12 ++++++++++++ 1 files changed, 12 insertions(+), 0 deletions(-)