Message ID | 20090407183623.7545bb0b@leela |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
From: Michal Schmidt <mschmidt@redhat.com>
Date: Tue, 7 Apr 2009 18:36:23 +0200

> The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
> was sometimes observed when setting MTU.
>
> skge_down() disables the TX queue, but then reenables it by mistake via
> skge_tx_clean().
> Fix it by moving the waking of the queue from skge_tx_clean() to the
> other caller. And to make sure start_xmit is not in progress on another
> CPU, skge_down() should call netif_tx_disable().
>
> The bug was reported to me by Jiri Jilek whose Debian system sometimes
> failed to boot. He tested the patch and the bug did not happen anymore.
>
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>

Stephen, an ACK possibly?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 08 Apr 2009 16:01:52 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Michal Schmidt <mschmidt@redhat.com>
> Date: Tue, 7 Apr 2009 18:36:23 +0200
>
> > The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
> > was sometimes observed when setting MTU.
> >
> > skge_down() disables the TX queue, but then reenables it by mistake via
> > skge_tx_clean().
> > Fix it by moving the waking of the queue from skge_tx_clean() to the
> > other caller. And to make sure start_xmit is not in progress on another
> > CPU, skge_down() should call netif_tx_disable().
> >
> > The bug was reported to me by Jiri Jilek whose Debian system sometimes
> > failed to boot. He tested the patch and the bug did not happen anymore.
> >
> > Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>
> Stephen, an ACK possibly?

I wanted to test on real hardware, and am offsite this week.
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 8 Apr 2009 16:06:21 -0700

> On Wed, 08 Apr 2009 16:01:52 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
>
>> From: Michal Schmidt <mschmidt@redhat.com>
>> Date: Tue, 7 Apr 2009 18:36:23 +0200
>>
>> > The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
>> > was sometimes observed when setting MTU.
>> >
>> > skge_down() disables the TX queue, but then reenables it by mistake via
>> > skge_tx_clean().
>> > Fix it by moving the waking of the queue from skge_tx_clean() to the
>> > other caller. And to make sure start_xmit is not in progress on another
>> > CPU, skge_down() should call netif_tx_disable().
>> >
>> > The bug was reported to me by Jiri Jilek whose Debian system sometimes
>> > failed to boot. He tested the patch and the bug did not happen anymore.
>> >
>> > Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>>
>> Stephen, an ACK possibly?
>
> I wanted to test on real hardware, and am offsite this week.

Ok, I'll wait for that, thanks!
On Tue, 7 Apr 2009 18:36:23 +0200
Michal Schmidt <mschmidt@redhat.com> wrote:

> The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
> was sometimes observed when setting MTU.
>
> skge_down() disables the TX queue, but then reenables it by mistake via
> skge_tx_clean().
> Fix it by moving the waking of the queue from skge_tx_clean() to the
> other caller. And to make sure start_xmit is not in progress on another
> CPU, skge_down() should call netif_tx_disable().
>
> The bug was reported to me by Jiri Jilek whose Debian system sometimes
> failed to boot. He tested the patch and the bug did not happen anymore.

It's conventional to add the reporter's "Reported-by:" tag to the
changelog in this situation.

> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>

As the bug is present in 2.6.29 (and possibly earlier?) it's appropriate
to add a Cc: <stable@kernel.org> too. This makes davem go mad at you,
but I prefer getting madded at over possibly losing bugfixes ;)

>
> diff --git a/drivers/net/skge.c b/drivers/net/skge.c
> index 952d37f..b2a05af 100644
> --- a/drivers/net/skge.c
> +++ b/drivers/net/skge.c
> @@ -2674,7 +2674,7 @@ static int skge_down(struct net_device *dev)
>  	if (netif_msg_ifdown(skge))
>  		printk(KERN_INFO PFX "%s: disabling interface\n", dev->name);
>  
> -	netif_stop_queue(dev);
> +	netif_tx_disable(dev);
>  
>  	if (hw->chip_id == CHIP_ID_GENESIS && hw->phy_type == SK_PHY_XMAC)
>  		del_timer_sync(&skge->link_timer);
> @@ -2881,7 +2881,6 @@ static void skge_tx_clean(struct net_device *dev)
>  	}
>  
>  	skge->tx_ring.to_clean = e;
> -	netif_wake_queue(dev);
>  }
>  
>  static void skge_tx_timeout(struct net_device *dev)
> @@ -2893,6 +2892,7 @@ static void skge_tx_timeout(struct net_device *dev)
>  
>  	skge_write8(skge->hw, Q_ADDR(txqaddr[skge->port], Q_CSR), CSR_STOP);
>  	skge_tx_clean(dev);
> +	netif_wake_queue(dev);
>  }
>  
>  static int skge_change_mtu(struct net_device *dev, int new_mtu)
From: Michal Schmidt <mschmidt@redhat.com>
Date: Tue, 7 Apr 2009 18:36:23 +0200

> The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
> was sometimes observed when setting MTU.
>
> skge_down() disables the TX queue, but then reenables it by mistake via
> skge_tx_clean().
> Fix it by moving the waking of the queue from skge_tx_clean() to the
> other caller. And to make sure start_xmit is not in progress on another
> CPU, skge_down() should call netif_tx_disable().
>
> The bug was reported to me by Jiri Jilek whose Debian system sometimes
> failed to boot. He tested the patch and the bug did not happen anymore.
>
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>

Stephen have you had a chance to test this yet?

Thanks.
On Tue, 7 Apr 2009 18:36:23 +0200
Michal Schmidt <mschmidt@redhat.com> wrote:

> The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
> was sometimes observed when setting MTU.
>
> skge_down() disables the TX queue, but then reenables it by mistake via
> skge_tx_clean().
> Fix it by moving the waking of the queue from skge_tx_clean() to the
> other caller. And to make sure start_xmit is not in progress on another
> CPU, skge_down() should call netif_tx_disable().
>
> The bug was reported to me by Jiri Jilek whose Debian system sometimes
> failed to boot. He tested the patch and the bug did not happen anymore.
>
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> ---
>  drivers/net/skge.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)

Tested fine. This should go to stable as well.

Acked-by: Stephen Hemminger <shemminger@vyatta.com>
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Tue, 14 Apr 2009 10:55:39 -0700

> On Tue, 7 Apr 2009 18:36:23 +0200
> Michal Schmidt <mschmidt@redhat.com> wrote:
>
>> The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
>> was sometimes observed when setting MTU.
>>
>> skge_down() disables the TX queue, but then reenables it by mistake via
>> skge_tx_clean().
>> Fix it by moving the waking of the queue from skge_tx_clean() to the
>> other caller. And to make sure start_xmit is not in progress on another
>> CPU, skge_down() should call netif_tx_disable().
>>
>> The bug was reported to me by Jiri Jilek whose Debian system sometimes
>> failed to boot. He tested the patch and the bug did not happen anymore.
>>
>> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>> ---
>>  drivers/net/skge.c |    4 ++--
>>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> Tested fine. This should go to stable as well.
>
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>

Applied, thanks everyone.
diff --git a/drivers/net/skge.c b/drivers/net/skge.c
index 952d37f..b2a05af 100644
--- a/drivers/net/skge.c
+++ b/drivers/net/skge.c
@@ -2674,7 +2674,7 @@ static int skge_down(struct net_device *dev)
 	if (netif_msg_ifdown(skge))
 		printk(KERN_INFO PFX "%s: disabling interface\n", dev->name);
 
-	netif_stop_queue(dev);
+	netif_tx_disable(dev);
 
 	if (hw->chip_id == CHIP_ID_GENESIS && hw->phy_type == SK_PHY_XMAC)
 		del_timer_sync(&skge->link_timer);
@@ -2881,7 +2881,6 @@ static void skge_tx_clean(struct net_device *dev)
 	}
 
 	skge->tx_ring.to_clean = e;
-	netif_wake_queue(dev);
 }
 
 static void skge_tx_timeout(struct net_device *dev)
@@ -2893,6 +2892,7 @@ static void skge_tx_timeout(struct net_device *dev)
 
 	skge_write8(skge->hw, Q_ADDR(txqaddr[skge->port], Q_CSR), CSR_STOP);
 	skge_tx_clean(dev);
+	netif_wake_queue(dev);
 }
 
 static int skge_change_mtu(struct net_device *dev, int new_mtu)
The BUG_ON(skge->tx_ring.to_use != skge->tx_ring.to_clean) in skge_up()
was sometimes observed when setting MTU.

skge_down() disables the TX queue, but then reenables it by mistake via
skge_tx_clean().
Fix it by moving the waking of the queue from skge_tx_clean() to the
other caller. And to make sure start_xmit is not in progress on another
CPU, skge_down() should call netif_tx_disable().

The bug was reported to me by Jiri Jilek whose Debian system sometimes
failed to boot. He tested the patch and the bug did not happen anymore.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/skge.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
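
For readers who want to see the failure mode in isolation, here is a rough
user-space sketch of the sequence the changelog describes. It is not kernel
code: struct model, the model_*() helpers and ring_empty_after_down() are
hypothetical names made up for illustration, and the ring is reduced to two
counters. It is single-threaded, so it only models the "queue woken by
mistake" half of the fix, not the start_xmit race on another CPU that
netif_tx_disable() is meant to close.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the driver state; field names mirror the patch. */
struct model {
	unsigned int to_use;    /* next ring slot the transmit path fills */
	unsigned int to_clean;  /* next ring slot the cleanup path frees */
	bool queue_stopped;     /* stands in for the netdev TX queue state */
};

/* Transmit path: only queues a packet while the queue is running. */
static void model_xmit(struct model *m)
{
	if (!m->queue_stopped)
		m->to_use++;
}

/* Cleanup: frees every pending slot. With wake_in_clean set it also
 * wakes the queue, which is the pre-patch behaviour. */
static void model_tx_clean(struct model *m, bool wake_in_clean)
{
	m->to_clean = m->to_use;
	if (wake_in_clean)
		m->queue_stopped = false;
}

/* Interface down: stop the queue, then clean the ring. */
static void model_down(struct model *m, bool wake_in_clean)
{
	m->queue_stopped = true;
	model_tx_clean(m, wake_in_clean);
}

/* Returns true if to_use == to_clean at the point where the real
 * skge_up() would evaluate its BUG_ON(). */
bool ring_empty_after_down(bool wake_in_clean)
{
	struct model m = { .to_use = 3, .to_clean = 1, .queue_stopped = false };

	model_down(&m, wake_in_clean);
	model_xmit(&m);  /* transmit attempt arriving between down and up */
	return m.to_use == m.to_clean;
}
```

Calling ring_empty_after_down(true) models the old code, where the wake
inside the cleanup lets one more packet into the ring and the two indices
diverge; ring_empty_after_down(false) models the patched code, where the
queue stays stopped and the ring is still empty when the interface comes
back up.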