Message ID | 25a379578b71bf01f3c77ac76a193d26554f9e0c.1308555865.git.richard.cochran@omicron.at |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
On Mon, 2011-06-20 at 09:48 +0200, Richard Cochran wrote: Hi Richard, > Because the socket buffer is freed in the completion interrupt, it is not > safe to access it after submitting it to the hardware. I don't see why. skb is freed from txq_reclaim() which grabs the tx queue lock before, (hence the lockless __skb_queue_xxx() in both functions) What am I missing ? > Cc: stable@kernel.org > Cc: Lennert Buytenhek <buytenh@wantstofly.org> > Signed-off-by: Richard Cochran <richard.cochran@omicron.at> > --- > drivers/net/mv643xx_eth.c | 6 ++++-- > 1 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c > index a5d9b1c..1b7d2c1 100644 > --- a/drivers/net/mv643xx_eth.c > +++ b/drivers/net/mv643xx_eth.c > @@ -859,7 +859,7 @@ no_csum: > static netdev_tx_t mv643xx_eth_xmit(struct sk_buff *skb, struct net_device *dev) > { > struct mv643xx_eth_private *mp = netdev_priv(dev); > - int queue; > + int length, queue; > struct tx_queue *txq; > struct netdev_queue *nq; > > @@ -881,10 +881,12 @@ static netdev_tx_t mv643xx_eth_xmit(struct sk_buff *skb, struct net_device *dev) > return NETDEV_TX_OK; > } > > + length = skb->len; > + > if (!txq_submit_skb(txq, skb)) { > int entries_left; > > - txq->tx_bytes += skb->len; > + txq->tx_bytes += length; > txq->tx_packets++; > > entries_left = txq->tx_ring_size - txq->tx_desc_count;
Le lundi 20 juin 2011 à 09:48 +0200, Richard Cochran a écrit : > Because the socket buffer is freed in the completion interrupt, it is not > safe to access it after submitting it to the hardware. > > Cc: stable@kernel.org > Cc: Lennert Buytenhek <buytenh@wantstofly.org> > Signed-off-by: Richard Cochran <richard.cochran@omicron.at> > --- > drivers/net/mv643xx_eth.c | 6 ++++-- > 1 files changed, 4 insertions(+), 2 deletions(-) > Acked-by: Eric Dumazet <eric.dumazet@gmail.com> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le lundi 20 juin 2011 à 16:07 +0200, Maxime Bizon a écrit : > On Mon, 2011-06-20 at 09:48 +0200, Richard Cochran wrote: > > Hi Richard, > > > Because the socket buffer is freed in the completion interrupt, it is not > > safe to access it after submitting it to the hardware. > > I don't see why. > > skb is freed from txq_reclaim() which grabs the tx queue lock before, > (hence the lockless __skb_queue_xxx() in both functions) > > What am I missing ? You are right, this driver still takes tx queue lock in its TX completion path. txq_reclaim() can currently run for a very long time, blocking other cpus to make progress if blocked on tx queue lock. So consider this patch as a cleanup : Its better to make all drivers behave the same way 1) Either increment tx stats in their start_xmit(), and making sure they dont access skb->len too late. 2) increment tx stats in their TX completion path. Then we can work on making TX completion path not taking tx queue lock in its fast path. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jun 20, 2011 at 09:48:07AM +0200, Richard Cochran wrote: > Because the socket buffer is freed in the completion interrupt, it > is not safe to access it after submitting it to the hardware. Maybe I'm missing something here, but mv643xx_eth TX reclaim is done from NAPI poll, under __netif_tx_lock(), while mv643xx_eth_xmit() also runs under __netif_tx_lock(). -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le lundi 20 juin 2011 à 18:33 +0200, Lennert Buytenhek a écrit : > On Mon, Jun 20, 2011 at 09:48:07AM +0200, Richard Cochran wrote: > > > Because the socket buffer is freed in the completion interrupt, it > > is not safe to access it after submitting it to the hardware. > > Maybe I'm missing something here, but mv643xx_eth TX reclaim is done > from NAPI poll, under __netif_tx_lock(), while mv643xx_eth_xmit() also > runs under __netif_tx_lock(). See my previous answer. Its true this driver _currently_ holds tx queue lock in its TX completion. But that might/should change. Goal is to make tx completion not use tx queue lock in fast path, like its done in tg3, bnx2, bnx2x ... and other recent drivers. Its obviously correct to move skb->len access in start_xmit() before starting the IO, even if not a bug fix, it makes all drivers behave the same : When reviewing them, its easier not to worry about these possible use after free. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le lundi 20 juin 2011 à 19:19 +0200, Eric Dumazet a écrit : > Le lundi 20 juin 2011 à 18:33 +0200, Lennert Buytenhek a écrit : > > On Mon, Jun 20, 2011 at 09:48:07AM +0200, Richard Cochran wrote: > > > > > Because the socket buffer is freed in the completion interrupt, it > > > is not safe to access it after submitting it to the hardware. > > > > Maybe I'm missing something here, but mv643xx_eth TX reclaim is done > > from NAPI poll, under __netif_tx_lock(), while mv643xx_eth_xmit() also > > runs under __netif_tx_lock(). > > See my previous answer. Its true this driver _currently_ holds tx queue > lock in its TX completion. But that might/should change. > > Goal is to make tx completion not use tx queue lock in fast path, like > its done in tg3, bnx2, bnx2x ... and other recent drivers. > > Its obviously correct to move skb->len access in start_xmit() before > starting the IO, even if not a bug fix, it makes all drivers behave the > same : When reviewing them, its easier not to worry about these possible > use after free. > > One random example found in drivers/staging/hv/netvsc_drv.c Is the "net->stats.tx_bytes += skb->len;" found in line 188 is correct ? Who knows ? I believe its not correct ;) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Richard Cochran <richardcochran@gmail.com> Date: Mon, 20 Jun 2011 09:48:07 +0200 > Because the socket buffer is freed in the completion interrupt, it is not > safe to access it after submitting it to the hardware. > > Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Since this isn't actually a bonafide bug fix I've applied this to net-next-2.6 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c index a5d9b1c..1b7d2c1 100644 --- a/drivers/net/mv643xx_eth.c +++ b/drivers/net/mv643xx_eth.c @@ -859,7 +859,7 @@ no_csum: static netdev_tx_t mv643xx_eth_xmit(struct sk_buff *skb, struct net_device *dev) { struct mv643xx_eth_private *mp = netdev_priv(dev); - int queue; + int length, queue; struct tx_queue *txq; struct netdev_queue *nq; @@ -881,10 +881,12 @@ static netdev_tx_t mv643xx_eth_xmit(struct sk_buff *skb, struct net_device *dev) return NETDEV_TX_OK; } + length = skb->len; + if (!txq_submit_skb(txq, skb)) { int entries_left; - txq->tx_bytes += skb->len; + txq->tx_bytes += length; txq->tx_packets++; entries_left = txq->tx_ring_size - txq->tx_desc_count;
Because the socket buffer is freed in the completion interrupt, it is not safe to access it after submitting it to the hardware. Cc: stable@kernel.org Cc: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Richard Cochran <richard.cochran@omicron.at> --- drivers/net/mv643xx_eth.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-)