Message ID | 1408976542-15624-1-git-send-email-jonas.jensen@gmail.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
From: Jonas Jensen <jonas.jensen@gmail.com> Date: Mon, 25 Aug 2014 16:22:22 +0200 > build_skb() is used to make skbs out of existing RX ring memory > which is bad because the RX ring is allocated only once, on probe. > Memory corruption occur because said memory is reclaimed, i.e. > __kfree_skb() (and eventually put_page()). > > Replace build_skb() with netdev_alloc_skb_ip_align() and use memcpy(). > > Remove SKB_DATA_ALIGN() from RX buffer size while we're at it. > > Addresses https://bugzilla.kernel.org/show_bug.cgi?id=69041 > > Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com> Applied. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Monday 25 August 2014 16:22:22 Jonas Jensen wrote: > @@ -226,13 +226,15 @@ static int moxart_rx_poll(struct napi_struct *napi, int budget) > if (len > RX_BUF_SIZE) > len = RX_BUF_SIZE; > > - skb = build_skb(priv->rx_buf[rx_head], priv->rx_buf_size); > + skb = netdev_alloc_skb_ip_align(ndev, len); > + > if (unlikely(!skb)) { > - net_dbg_ratelimited("build_skb failed\n"); > + net_dbg_ratelimited("netdev_alloc_skb_ip_align failed\n"); > priv->stats.rx_dropped++; > priv->stats.rx_errors++; > } > > + memcpy(skb->data, priv->rx_buf[rx_head], len); > skb_put(skb, len); > skb->protocol = eth_type_trans(skb, ndev); > napi_gro_receive(&priv->napi, skb); While this seems correct, I wonder why you don't do the normal approach of dequeuing the skb from the chain and adding a newly allocated skb to it to save the memcpy. Arnd -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Arnd Bergmann > On Monday 25 August 2014 16:22:22 Jonas Jensen wrote: > > @@ -226,13 +226,15 @@ static int moxart_rx_poll(struct napi_struct *napi, int budget) > > if (len > RX_BUF_SIZE) > > len = RX_BUF_SIZE; > > > > - skb = build_skb(priv->rx_buf[rx_head], priv->rx_buf_size); > > + skb = netdev_alloc_skb_ip_align(ndev, len); > > + > > if (unlikely(!skb)) { > > - net_dbg_ratelimited("build_skb failed\n"); > > + net_dbg_ratelimited("netdev_alloc_skb_ip_align failed\n"); > > priv->stats.rx_dropped++; > > priv->stats.rx_errors++; > > } > > > > + memcpy(skb->data, priv->rx_buf[rx_head], len); Is this memcpy() aligned? If the hardware can receive to a 4n+2 offset it is probably better to copy the two bytes before the frame data and to round the length of to a whole number of words. > > skb_put(skb, len); > > skb->protocol = eth_type_trans(skb, ndev); > > napi_gro_receive(&priv->napi, skb); > > While this seems correct, I wonder why you don't do the normal approach of > dequeuing the skb from the chain and adding a newly allocated skb to it to > save the memcpy. Because the receive buffer area isn't made of skbs. Post-allocating the skb also reduces the 'true size' of the skb. David -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2014-08-26 at 09:10 +0000, David Laight wrote: > From: Arnd Bergmann > > While this seems correct, I wonder why you don't do the normal approach of > > dequeuing the skb from the chain and adding a newly allocated skb to it to > > save the memcpy. > > Because the receive buffer area isn't made of skbs. > Post-allocating the skb also reduces the 'true size' of the skb. This strategy assumes this is not a 10Gbe NIC. We try to avoid copies because they are generally not needed. Wifi devices are usually slow, and packet losses are more frequent, so the copybreak gives better chance to not doing the collapses [1] later in the TCP stack. [1] collapses : reducing skb overhead (skb->len / skb->truesize ratio) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c index eed70d9..d66058d 100644 --- a/drivers/net/ethernet/moxa/moxart_ether.c +++ b/drivers/net/ethernet/moxa/moxart_ether.c @@ -226,13 +226,15 @@ static int moxart_rx_poll(struct napi_struct *napi, int budget) if (len > RX_BUF_SIZE) len = RX_BUF_SIZE; - skb = build_skb(priv->rx_buf[rx_head], priv->rx_buf_size); + skb = netdev_alloc_skb_ip_align(ndev, len); + if (unlikely(!skb)) { - net_dbg_ratelimited("build_skb failed\n"); + net_dbg_ratelimited("netdev_alloc_skb_ip_align failed\n"); priv->stats.rx_dropped++; priv->stats.rx_errors++; } + memcpy(skb->data, priv->rx_buf[rx_head], len); skb_put(skb, len); skb->protocol = eth_type_trans(skb, ndev); napi_gro_receive(&priv->napi, skb); @@ -464,8 +466,7 @@ static int moxart_mac_probe(struct platform_device *pdev) spin_lock_init(&priv->txlock); priv->tx_buf_size = TX_BUF_SIZE; - priv->rx_buf_size = RX_BUF_SIZE + - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + priv->rx_buf_size = RX_BUF_SIZE; priv->tx_desc_base = dma_alloc_coherent(NULL, TX_REG_DESC_SIZE * TX_DESC_NUM, &priv->tx_base,
build_skb() is used to make skbs out of existing RX ring memory which is bad because the RX ring is allocated only once, on probe. Memory corruption occur because said memory is reclaimed, i.e. __kfree_skb() (and eventually put_page()). Replace build_skb() with netdev_alloc_skb_ip_align() and use memcpy(). Remove SKB_DATA_ALIGN() from RX buffer size while we're at it. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=69041 Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com> --- Notes: Changes since v5: 1. broke out DMA synchronization to separate patch Applies to next-20140825 drivers/net/ethernet/moxa/moxart_ether.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)