Message ID:   1340281166.15484.16.camel@lb-tlvb-dmitry
State:        Changes Requested, archived
Delegated to: David Miller
On Thu, 2012-06-21 at 15:19 +0300, Dmitry Kravkov wrote:

> The crash happens with default configuration since
> [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2] "net/tcp: Fix tcp memory
> limits initialization when !CONFIG_SYSCTL", but it can be hit by
> increasing values of tcp_wmem even earlier.

This makes no sense.

> From: Dmitry Kravkov <dmitry@broadcom.com>
> Subject: [PATCH net-next] bnx2x: reservation for NEXT tx BDs
>
> Commit [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2]
> "net/tcp: Fix tcp memory limits initialization when !CONFIG_SYSCTL"
> provided a new default value for tcp_wmem, so heavy tcp
> traffic may cause a TSO packet to consume 20 BDs + 1 for the next page
> descriptor.

This is completely bogus. I have no idea how you came to this.

A forwarding workload can trigger the same bug, if GRO is enabled.

Remove this wrong bit, please?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2012-06-21 at 17:12 +0200, Eric Dumazet wrote:
> On Thu, 2012-06-21 at 15:19 +0300, Dmitry Kravkov wrote:
>
> > The crash happens with default configuration since
> > [4acb41903b2f99f3dffd4c3df9acc84ca5942cb2] "net/tcp: Fix tcp memory
> > limits initialization when !CONFIG_SYSCTL", but it can be hit by
> > increasing values of tcp_wmem even earlier.
>
> This makes no sense.

Bisected to this commit, and reproduced before the commit only after:

echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem

By this, max nr_frags raised from 8 to 17, when running 40 netperfs.

I've decreased the rx queue to 200 during the test.

> [...]
On Thu, 2012-06-21 at 18:56 +0300, Dmitry Kravkov wrote:
> On Thu, 2012-06-21 at 17:12 +0200, Eric Dumazet wrote:
> > [...]
> > This makes no sense.
> Bisected to this commit and reproduced before the commit only after:
> echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem
> by this max nr_frags raised from 8 to 17, when running 40 netperfs
>
> i've decreased rx queue to 200, during the test

Sorry, tx queue.

> [...]
On Thu, 2012-06-21 at 18:56 +0300, Dmitry Kravkov wrote:
> On Thu, 2012-06-21 at 17:12 +0200, Eric Dumazet wrote:
> > [...]
> > This makes no sense.
> Bisected to this commit and reproduced before the commit only after:
> echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem
> by this max nr_frags raised from 8 to 17, when running 40 netperfs
>
> i've decreased rx queue to 200, during the test

I repeat: this bug can be triggered anytime with an skb not provided by
the local tcp stack.

By the way, looking at your fix, it's pretty obvious the fix has nothing
to do with the TCP stack.

The commit changelog must be accurate, so please remove this wrong
information. It will confuse future readers.
On Thu, 2012-06-21 at 18:25 +0200, Eric Dumazet wrote:
> On Thu, 2012-06-21 at 18:56 +0300, Dmitry Kravkov wrote:
> > [...]
>
> I repeat, this bug can be triggered anytime with a skb not provided by
> local tcp stack.

Got it.

> By the way, looking at your fix, its pretty obvious the fix has nothing
> to do with TCP stack.

It just describes the "4" number.

> commit changelog must be accurate, so please remove this wrong
> information. This will confuse future readers.

Will do so.
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 7de8241..be75f45 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -616,6 +616,21 @@ struct bnx2x_fastpath {
 #define TX_BD(x)		((x) & MAX_TX_BD)
 #define TX_BD_POFF(x)		((x) & MAX_TX_DESC_CNT)
 
+/* how many NEXT PAGE descriptors a packet may occupy */
+#define NEXT_CNT_PER_TX_PKT(bds) \
+	(((bds) + MAX_TX_DESC_CNT - 1) / \
+	 MAX_TX_DESC_CNT * NEXT_PAGE_TX_DESC_CNT)
+/* max pure data/headers BDs per tx packet:
+ * START_BD         - describes packet
+ * START_BD (split) - includes unpaged data segment for GSO
+ * PARSING_BD       - for TSO and CSUM data
+ * Frag BDs         - describe pages for frags
+ */
+#define MAX_BDS_PER_TX_PKT	(MAX_SKB_FRAGS + 3)
+/* max BDs per tx packet including next pages */
+#define MAX_DESC_PER_TX_PKT	(MAX_BDS_PER_TX_PKT + \
+				 NEXT_CNT_PER_TX_PKT(MAX_BDS_PER_TX_PKT))
+
 /* The RX BD ring is special, each bd is 8 bytes but the last one is 16 */
 #define NUM_RX_RINGS		8
 #define RX_DESC_CNT		(BCM_PAGE_SIZE / sizeof(struct eth_rx_bd))
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 8098eea..c4928ea 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -190,7 +190,7 @@ int bnx2x_tx_int(struct bnx2x *bp, struct bnx2x_fp_txdata *txdata)
 
 	if ((netif_tx_queue_stopped(txq)) &&
 	    (bp->state == BNX2X_STATE_OPEN) &&
-	    (bnx2x_tx_avail(bp, txdata) >= MAX_SKB_FRAGS + 4))
+	    (bnx2x_tx_avail(bp, txdata) >= MAX_DESC_PER_TX_PKT))
 		netif_tx_wake_queue(txq);
 
 	__netif_tx_unlock(txq);
@@ -2894,7 +2894,8 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	   txdata->cid, fp_index, txdata_index, txdata, fp); */
 
 	if (unlikely(bnx2x_tx_avail(bp, txdata) <
-		     (skb_shinfo(skb)->nr_frags + 3))) {
+		     skb_shinfo(skb)->nr_frags + 3 +
+		     NEXT_CNT_PER_TX_PKT(MAX_BDS_PER_TX_PKT))) {
 		fp->eth_q_stats.driver_xoff++;
 		netif_tx_stop_queue(txq);
 		BNX2X_ERR("BUG! Tx ring full when queue awake!\n");
@@ -3169,7 +3170,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	txdata->tx_bd_prod += nbd;
 
-	if (unlikely(bnx2x_tx_avail(bp, txdata) < MAX_SKB_FRAGS + 4)) {
+	if (unlikely(bnx2x_tx_avail(bp, txdata) < MAX_DESC_PER_TX_PKT)) {
 		netif_tx_stop_queue(txq);
 
 		/* paired memory barrier is in bnx2x_tx_int(), we have to keep
@@ -3178,7 +3179,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	smp_mb();
 	fp->eth_q_stats.driver_xoff++;
 
-	if (bnx2x_tx_avail(bp, txdata) >= MAX_SKB_FRAGS + 4)
+	if (bnx2x_tx_avail(bp, txdata) >= MAX_DESC_PER_TX_PKT)
 		netif_tx_wake_queue(txq);
 }
 	txdata->tx_pkt++;