diff mbox

[v1,4/6] net: fec: Increase buffer descriptor entry number

Message ID 1401415552-2263-5-git-send-email-b38611@freescale.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Nimrod Andy May 30, 2014, 2:05 a.m. UTC
In order to support SG and software TSO, let's increase the BD entry number.

Signed-off-by: Fugang Duan <B38611@freescale.com>
---
 drivers/net/ethernet/freescale/fec.h      |    6 +++---
 drivers/net/ethernet/freescale/fec_main.c |   21 ++++++++++++---------
 2 files changed, 15 insertions(+), 12 deletions(-)

Comments

David Laight May 30, 2014, 9:10 a.m. UTC | #1
From: Fugang Duan
> In order to support SG and software TSO, let's increase the BD entry number.

Software TSO shouldn't significantly increase the number of
descriptors required.
I'd guess it makes a lot of packets have 2 fragments.

...
> -#define FEC_ENET_RX_PAGES	8
> +#define FEC_ENET_RX_PAGES	256

Supporting TSO shouldn't need more RX descriptors.
While 16 descriptors isn't that many, 512 is a lot.
If PAGE_SIZE is greater than 4k it is a huge amount of memory.

>  #define FEC_ENET_RX_FRSIZE	2048
>  #define FEC_ENET_RX_FRPPG	(PAGE_SIZE / FEC_ENET_RX_FRSIZE)
>  #define RX_RING_SIZE		(FEC_ENET_RX_FRPPG * FEC_ENET_RX_PAGES)
>  #define FEC_ENET_TX_FRSIZE	2048
>  #define FEC_ENET_TX_FRPPG	(PAGE_SIZE / FEC_ENET_TX_FRSIZE)
> -#define TX_RING_SIZE		16	/* Must be power of two */
> -#define TX_RING_MOD_MASK	15	/*   for this to work */
> +#define TX_RING_SIZE		512	/* Must be power of two */
> +#define TX_RING_MOD_MASK	511	/*   for this to work */

Does the driver support BQL (or similar) in order to limit
the amount of queued tx traffic?
Otherwise you've significantly increased the latency for
connections other than one doing bulk tx.

	David



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Fugang Duan May 30, 2014, 9:42 a.m. UTC | #2
From: David Laight <David.Laight@ACULAB.COM> Date: Friday, May 30, 2014 5:11 PM
>To: Duan Fugang-B38611; Li Frank-B20596; davem@davemloft.net
>Cc: ezequiel.garcia@free-electrons.com; netdev@vger.kernel.org;
>shawn.guo@linaro.org; bhutchings@solarflare.com;
>stephen@networkplumber.org
>Subject: RE: [PATCH v1 4/6] net: fec: Increase buffer descriptor entry
>number
>
>From: Fugang Duan
>> In order to support SG and software TSO, let's increase the BD entry number.
>
>Software TSO shouldn't significantly increase the number of descriptors
>required.
>I'd guess it makes a lot of packets have 2 fragments.
>
>...
>> -#define FEC_ENET_RX_PAGES	8
>> +#define FEC_ENET_RX_PAGES	256
>
>Supporting TSO shouldn't need more RX descriptors.
>While 16 descriptors isn't that many, 512 is a lot.
>If PAGE_SIZE is greater than 4k it is a huge amount of memory.
>
Yes, 16 descriptors is too few; increasing the BD entry number can improve performance.
And I want to add a later patch to support interrupt coalescing, which needs more BD descriptors.

>>  #define FEC_ENET_RX_FRSIZE	2048
>>  #define FEC_ENET_RX_FRPPG	(PAGE_SIZE / FEC_ENET_RX_FRSIZE)
>>  #define RX_RING_SIZE		(FEC_ENET_RX_FRPPG * FEC_ENET_RX_PAGES)
>>  #define FEC_ENET_TX_FRSIZE	2048
>>  #define FEC_ENET_TX_FRPPG	(PAGE_SIZE / FEC_ENET_TX_FRSIZE)
>> -#define TX_RING_SIZE		16	/* Must be power of two */
>> -#define TX_RING_MOD_MASK	15	/*   for this to work */
>> +#define TX_RING_SIZE		512	/* Must be power of two */
>> +#define TX_RING_MOD_MASK	511	/*   for this to work */
>
>Does the driver support BQL (or similar) in order to limit the amount of
>queued tx traffic?
>Otherwise you've significantly increased the latency for connections other
>than one doing bulk tx.
>
>	David
>
The driver doesn't support BQL yet.
I will add the feature to the FEC driver. Thanks for your advice.

Thanks,
Andy
Eric Dumazet May 30, 2014, 1:13 p.m. UTC | #3
On Fri, 2014-05-30 at 09:42 +0000, fugang.duan@freescale.com wrote:
> From: David Laight <David.Laight@ACULAB.COM> Date: Friday, May 30, 2014 5:11 PM
> >
> >Does the driver support BQL (or similar) in order to limit the amount of
> >queued tx traffic?
> >Otherwise you've significantly increased the latency for connections other
> >than one doing bulk tx.
> >
> >	David
> >
> The driver doesn't support BQL yet.
> I will add the feature to the FEC driver. Thanks for your advice.

Note that a full size TSO packet (44 or 45 MSS) requires about 88 or 90
descriptors.

So I do not think BQL is really needed, because a 512-slot TX ring won't
add much latency: about 5 ms max.

BQL is generally nice for NICs supporting hardware TSO, where a 64KB TSO
packet consumes 3 or 4 descriptors.

Also note that TCP Small Queues should limit TX ring occupancy of a
single bulk flow anyway.


Ezequiel Garcia May 30, 2014, 2:01 p.m. UTC | #4
Hello Eric,

On 30 May 06:13 AM, Eric Dumazet wrote:
> On Fri, 2014-05-30 at 09:42 +0000, fugang.duan@freescale.com wrote:
> > From: David Laight <David.Laight@ACULAB.COM> Date: Friday, May 30, 2014 5:11 PM
> > >
> > >Does the driver support BQL (or similar) in order to limit the amount of
> > >queued tx traffic?
> > >Otherwise you've significantly increased the latency for connections other
> > >than one doing bulk tx.
> > >
> > >	David
> > >
> > The driver doesn't support BQL yet.
> > I will add the feature to the FEC driver. Thanks for your advice.
> 
> Note that a full size TSO packet (44 or 45 MSS) requires about 88 or 90
> descriptors.
> 

What's the rationale behind those numbers?

Following the Solarflare commit you pointed me to a while ago, I've set
gso_max_segs to limit the number of descriptors in the mvneta and
mv643xx_eth drivers.

My understanding, from reading sfc commit 7e6d06f0de3f7, is that the
maximum number of required descriptors would be around 250:

/* Header and payload descriptor for each output segment, plus
 * one for every input fragment boundary within a segment
 */
unsigned int max_descs = EFX_TSO_MAX_SEGS * 2 + MAX_SKB_FRAGS;

Maybe I've misunderstood something?

PS: I was also about to add BQL support to the drivers...
Eric Dumazet May 30, 2014, 2:54 p.m. UTC | #5
On Fri, 2014-05-30 at 11:01 -0300, ezequiel.garcia@free-electrons.com
wrote:
> Hello Eric,
> 
> On 30 May 06:13 AM, Eric Dumazet wrote:
> > On Fri, 2014-05-30 at 09:42 +0000, fugang.duan@freescale.com wrote:
> > > From: David Laight <David.Laight@ACULAB.COM> Date: Friday, May 30, 2014 5:11 PM
> > > >
> > > >Does the driver support BQL (or similar) in order to limit the amount of
> > > >queued tx traffic?
> > > >Otherwise you've significantly increased the latency for connections other
> > > >than one doing bulk tx.
> > > >
> > > >	David
> > > >
> > > The driver doesn't support BQL yet.
> > > I will add the feature to the FEC driver. Thanks for your advice.
> > 
> > Note that a full size TSO packet (44 or 45 MSS) requires about 88 or 90
> > descriptors.
> > 
> 
> What's the rationale behing those numbers?
> 

64KB TSO packet, with MSS=1460 -> 44 segments  (44*1460 = 64240)
with MSS=1448 (TCP timestamps) -> 45 segments  (45*1448 = 65160)

This software TSO emulation uses at least 2 descriptors per MSS

one descriptor to hold the headers (ethernet + ip + tcp)
one descriptor (or two) to hold the payload for this MSS


Eric Dumazet May 30, 2014, 3:40 p.m. UTC | #6
On Fri, 2014-05-30 at 15:08 +0000, fugang.duan@freescale.com wrote:

> If the frag page data does not match the alignment of the ethernet DMA controller, three descriptors are needed for one MSS:
> One descriptor for the headers, one for the first non-aligned bytes copied from the frag page, one for the rest of the frag page.
> 

You could avoid the 2nd descriptor by copying the unaligned part of the
payload into the first descriptor (the one containing the headers: those are
about 66 bytes, so you have room; if not, increase the 128-byte room to 192 bytes)

> So one frame may cost 3 x 45 descriptors

May... but in general it would be closer to 2 * 45


> 
> So setting the descriptor slots to 512 is not big, just reasonable. What do you think?

I believe it is fine: that is about five 64KB TSO packets.



Fugang Duan May 30, 2014, 3:46 p.m. UTC | #7
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Friday, May 30, 2014 11:41 PM
>To: Duan Fugang-B38611
>Cc: ezequiel.garcia@free-electrons.com; David Laight; Li Frank-B20596;
>davem@davemloft.net; netdev@vger.kernel.org; shawn.guo@linaro.org;
>bhutchings@solarflare.com; stephen@networkplumber.org
>Subject: RE: [PATCH v1 4/6] net: fec: Increase buffer descriptor entry
>number
>
>On Fri, 2014-05-30 at 15:08 +0000, fugang.duan@freescale.com wrote:
>
>> If the frag page data does not match the alignment of the ethernet DMA
>controller, three descriptors are needed for one MSS:
>> One descriptor for the headers, one for the first non-aligned bytes copied
>from the frag page, one for the rest of the frag page.
>>
>
>You could avoid the 2nd descriptor by copying the unaligned part of the
>payload into the first descriptor (the one containing the headers: those are
>about 66 bytes, so you have room; if not, increase the 128-byte room to 192 bytes)
>
>> So one frame may cost 3 x 45 descriptors
>
>May... but in general it would be closer to 2 * 45
>

Good idea!

>
>> So setting the descriptor slots to 512 is not big, just reasonable. What do
>you think?
>
>I believe it is fine: that is about five 64KB TSO packets.
>
Thanks,
Andy
Fugang Duan May 30, 2014, 3:52 p.m. UTC | #8
Hi, David

From: David Laight <David.Laight@ACULAB.COM> Date: Friday, May 30, 2014 11:35 PM
>To: Duan Fugang-B38611; Eric Dumazet; ezequiel.garcia@free-electrons.com
>Cc: Li Frank-B20596; davem@davemloft.net; netdev@vger.kernel.org;
>shawn.guo@linaro.org; bhutchings@solarflare.com;
>stephen@networkplumber.org
>Subject: RE: [PATCH v1 4/6] net: fec: Increase buffer descriptor entry
>number
>
>From: fugang.duan@freescale.com
>> >64KB TSO packet, with MSS=1460 -> 44 segments  (44*1460 = 64240) with
>> >MSS=1448 (TCP timestamps) -> 45 segments  (45*1448 = 65160)
>> >
>> >This software TSO emulation uses at least 2 descriptors per MSS
>> >
>> >one descriptor to hold the headers (ethernet + ip + tcp) one
>> >descriptor (or two) to hold the payload for this MSS
>> >
>> Thanks for Eric's detailed explanation.
>>
>> If the frag page data does not match the alignment of the ethernet DMA
>> controller, three descriptors are needed for one MSS:
>> One descriptor for the headers, one for the first non-aligned bytes copied
>> from the frag page, one for the rest of the frag page.
>>
>> So one frame may cost 3 x 45 descriptors
>
>No - that is 45 frames, typically needing 3 ring entries each.
>
>> And I will add interrupt coalescing support for tx and rx, which also
>> costs some more descriptors.
>>
>> So setting the descriptor slots to 512 is not big, just reasonable. What do
>you think?
>
>Software TSO generates lots of separate ethernet frames, there is no
>absolute requirement to be able to put all of them into the tx ring at
>once.
>
>The required size for the tx ring is much more likely to be related to any
>interrupt mitigation that delays the refilling of ring entries.
>512 sounds like a lot of tx ring entries.
>
>The receive ring doesn't need to allow for fragments - your driver is in
>control of allocating the buffers.
>I didn't understand why you've based the number of rx ring entries on
>PAGE_SIZE - IIRC that might be 64k, or even larger.
>I'd have expected 128 or 256 rx ring entries to be typical, but how many
>are needed is a separate issue.
>If the system can keep up with the maximum ethernet data rate I'd expect
>that a smaller number would be fine.
>If it can't keep up you'll lose packets anyway.
>Aggressive power saving (with wake on LAN) might need more.
>
>	David

Thanks for the information. I will do more testing to find the smallest number that still sustains the maximum data rate.
Summarizing your and Eric's thoughts, rx set to 256 and tx set to 512 should be fine.

Thanks,
Andy
Eric Dumazet May 30, 2014, 3:58 p.m. UTC | #9
On Fri, 2014-05-30 at 15:34 +0000, David Laight wrote:

> Software TSO generates lots of separate ethernet frames, there is no
> absolute requirement to be able to put all of them into the tx ring at once.

Yes, there is an absolute requirement. It is the driver's responsibility to block
the queue if the available slots in the TX ring would not allow the
following packet to be sent.

That's why a TSO emulation needs to set gso_max_segs to some sane value

(sfc sets it to 100)

In Fugang's patch, this part was the wrong way to deal with this:

+       if (tso_count_descs(skb) >= fec_enet_txdesc_entry_free(fep)) {
+               if (net_ratelimit())
+                       netdev_err(ndev, "tx queue full!\n");
+               return NETDEV_TX_BUSY;
+       }
+

This was copy/pasted from buggy
drivers/net/ethernet/marvell/mv643xx_eth.c

> 
> The required size for the tx ring is much more likely to be related
> to any interrupt mitigation that delays the refilling of ring entries.
> 512 sounds like a lot of tx ring entries.

512 slots -> 256 frames (considering SG: headers/payload use 2 desc)

-> 3 ms on a Gbit NIC. It's about the right size.


Ezequiel Garcia May 30, 2014, 4:35 p.m. UTC | #10
On 30 May 08:58 AM, Eric Dumazet wrote:
> On Fri, 2014-05-30 at 15:34 +0000, David Laight wrote:
> 
> > Software TSO generates lots of separate ethernet frames, there is no
> > absolute requirement to be able to put all of them into the tx ring at once.
> 
> Yes, there is an absolute requirement. It is the driver's responsibility to block
> the queue if the available slots in the TX ring would not allow the
> following packet to be sent.
> 
> That's why a TSO emulation needs to set gso_max_segs to some sane value
> 
> (sfc sets it to 100)
> 
> In Fugang's patch, this part was the wrong way to deal with this:
> 
> +       if (tso_count_descs(skb) >= fec_enet_txdesc_entry_free(fep)) {
> +               if (net_ratelimit())
> +                       netdev_err(ndev, "tx queue full!\n");
> +               return NETDEV_TX_BUSY;
> +       }
> +
> 
> This was copy/pasted from buggy
> drivers/net/ethernet/marvell/mv643xx_eth.c
> 

Indeed. I'm just about to submit fixes for those, which may be useful for
fixing this driver as well.

Patch

diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 3b8d6d1..74fbd49 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -240,14 +240,14 @@  struct bufdesc_ex {
  * the skbuffer directly.
  */
 
-#define FEC_ENET_RX_PAGES	8
+#define FEC_ENET_RX_PAGES	256
 #define FEC_ENET_RX_FRSIZE	2048
 #define FEC_ENET_RX_FRPPG	(PAGE_SIZE / FEC_ENET_RX_FRSIZE)
 #define RX_RING_SIZE		(FEC_ENET_RX_FRPPG * FEC_ENET_RX_PAGES)
 #define FEC_ENET_TX_FRSIZE	2048
 #define FEC_ENET_TX_FRPPG	(PAGE_SIZE / FEC_ENET_TX_FRSIZE)
-#define TX_RING_SIZE		16	/* Must be power of two */
-#define TX_RING_MOD_MASK	15	/*   for this to work */
+#define TX_RING_SIZE		512	/* Must be power of two */
+#define TX_RING_MOD_MASK	511	/*   for this to work */
 
 #define BD_ENET_RX_INT          0x00800000
 #define BD_ENET_RX_PTP          ((ushort)0x0400)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 79f8ac0..714100f 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -172,10 +172,6 @@  MODULE_PARM_DESC(macaddr, "FEC Ethernet MAC address");
 #endif
 #endif /* CONFIG_M5272 */
 
-#if (((RX_RING_SIZE + TX_RING_SIZE) * 32) > PAGE_SIZE)
-#error "FEC: descriptor ring size constants too large"
-#endif
-
 /* Interrupt events/masks. */
 #define FEC_ENET_HBERR	((uint)0x80000000)	/* Heartbeat error */
 #define FEC_ENET_BABR	((uint)0x40000000)	/* Babbling receiver */
@@ -2060,9 +2056,20 @@  static int fec_enet_init(struct net_device *ndev)
 	const struct platform_device_id *id_entry =
 				platform_get_device_id(fep->pdev);
 	struct bufdesc *cbd_base;
+	int bd_size;
+
+	/* init the tx & rx ring size */
+	fep->tx_ring_size = TX_RING_SIZE;
+	fep->rx_ring_size = RX_RING_SIZE;
+
+	if (fep->bufdesc_ex)
+		bd_size = sizeof(struct bufdesc_ex);
+	else
+		bd_size = sizeof(struct bufdesc);
+	bd_size *= (fep->tx_ring_size + fep->rx_ring_size);
 
 	/* Allocate memory for buffer descriptors. */
-	cbd_base = dma_alloc_coherent(NULL, PAGE_SIZE, &fep->bd_dma,
+	cbd_base = dma_alloc_coherent(NULL, bd_size, &fep->bd_dma,
 				      GFP_KERNEL);
 	if (!cbd_base)
 		return -ENOMEM;
@@ -2076,10 +2083,6 @@  static int fec_enet_init(struct net_device *ndev)
 	/* make sure MAC we just acquired is programmed into the hw */
 	fec_set_mac_address(ndev, NULL);
 
-	/* init the tx & rx ring size */
-	fep->tx_ring_size = TX_RING_SIZE;
-	fep->rx_ring_size = RX_RING_SIZE;
-
 	/* Set receive and transmit descriptor base. */
 	fep->rx_bd_base = cbd_base;
 	if (fep->bufdesc_ex)