Patchwork mlx4_en: add UFO support

Submitter Thadeu Lima de Souza Cascardo
Date Aug. 2, 2012, 8:53 p.m.
Message ID <1343940824-4720-1-git-send-email-cascardo@linux.vnet.ibm.com>
Permalink /patch/174813/
State Changes Requested
Delegated to: David Miller

Comments

Thadeu Lima de Souza Cascardo - Aug. 2, 2012, 8:53 p.m.
Mellanox Ethernet adapters support Large Segmentation Offload for UDP
packets. The only change needed is using the proper header size when the
packet is UDP instead of TCP.

This significantly increases performance for large UDP packets on
platforms which have an expensive dma_map call, like pseries.

On a simple test with a 64000-byte payload, throughput increased from
about 6Gbps to 9.5Gbps, while CPU usage dropped from about 600% to about
80% or less, on an 8-core Power7 machine.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |    2 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c     |    7 ++++++-
 2 files changed, 7 insertions(+), 2 deletions(-)
Yevgeny Petrilin - Aug. 3, 2012, 8:29 a.m.
> 
> Mellanox Ethernet adapters support Large Segmentation Offload for UDP
> packets. The only change needed is using the proper header size when the
> packet is UDP instead of TCP.
> 
> This significantly increases performance for large UDP packets on platforms
> which have an expensive dma_map call, like pseries.
> 
> On a simple test with 64000 payload size, throughput has increased from
> about 6Gbps to 9.5Gbps, while CPU use dropped from about 600% to about
> 80% or less, on a 8-core Power7 machine.
> 
Hi Thadeu,
Can you please send the info regarding the adapter you are testing with? What test are you running?
I just tried this patch with netperf on my x86_64, and it doesn't work. Packets are not fragmented properly (fragment offsets are not calculated).
It is true that the TX side doesn't work as hard (OS doesn't need to do the fragmentation), but traffic is not sent properly on the wire.

I'll do further investigation and get back with more details.

Yevgeny
Thadeu Lima de Souza Cascardo - Aug. 3, 2012, 1:34 p.m.
On Fri, Aug 03, 2012 at 08:29:26AM +0000, Yevgeny Petrilin wrote:
> > 
> > Mellanox Ethernet adapters support Large Segmentation Offload for UDP
> > packets. The only change needed is using the proper header size when the
> > packet is UDP instead of TCP.
> > 
> > This significantly increases performance for large UDP packets on platforms
> > which have an expensive dma_map call, like pseries.
> > 
> > On a simple test with 64000 payload size, throughput has increased from
> > about 6Gbps to 9.5Gbps, while CPU use dropped from about 600% to about
> > 80% or less, on a 8-core Power7 machine.
> > 
> Hi Thadeu,
> Can you please send the info regarding the adapter you are testing with? What test are you running?
> I just tried this patch with netperf on my x86_64, and it doesn't work. Packets are not fragmented properly (fragment offsets are not calculated).
> It is true that the TX side doesn't work as hard (OS doesn't need to do the fragmentation), but traffic is not sent properly on the wire.
> 
> I'll do further investigation and get back with more details.
> 
> Yevgeny
> 

Hi, Yevgeny.

At first, I only added the UFO feature. When testing that, I got lots of
errors on the receiving end, like:

UDP: short packet: From 10.0.0.2:0 0/1480 to 10.0.0.3:0
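
That message is logged by the UDP receive path when the length claimed by the
UDP header does not agree with what actually arrived (as far as I can tell,
the 0/1480 pair in the message is the claimed UDP length versus the received
length). Roughly, paraphrasing the checks from __udp4_lib_rcv() in
net/ipv4/udp.c (a simplified sketch, not the exact code):

	ulen = ntohs(uh->len);		/* length claimed by the UDP header  */
	if (ulen > skb->len)		/* claims more than actually arrived */
		goto short_packet;
	if (ulen < sizeof(*uh))		/* or less than a full UDP header    */
		goto short_packet;	/* -> "UDP: short packet: ..."       */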

Once I saw what the driver was writing to the LSO descriptor, it was
obvious why this happened: the driver was using the TCP header length as
the header size, which for a UDP packet means reading a bogus value out
of the datagram payload.
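
To make that concrete: tcp_hdrlen() reads the transport header as a struct
tcphdr and returns doff * 4, and doff sits 12 bytes into a TCP header, but a
UDP header is only 8 bytes long, so for a UDP packet those bits come out of
the datagram payload. Sketched against the get_real_size() hunk in the patch
below, the contrast is:

	/* Old: always treat the transport header as TCP.  For a UDP GSO skb,
	 * tcp_hdrlen() pulls doff out of payload bytes, so the header size
	 * written to the LSO descriptor is garbage.
	 */
	*lso_header_size = skb_transport_offset(skb) + tcp_hdrlen(skb);

	/* New: use the fixed 8-byte UDP header for SKB_GSO_UDP packets. */
	*lso_header_size = skb_transport_offset(skb);
	if (skb_shinfo(skb)->gso_type == SKB_GSO_UDP)
		*lso_header_size += sizeof(struct udphdr);
	else
		*lso_header_size += tcp_hdrlen(skb);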

With that additional change, however, everything should be fine. I ran a
uperf test with 64000-byte payloads and it all seemed to work.

The card I have in here is:

0001:01:00.0 Ethernet controller: Mellanox Technologies MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0)
        Subsystem: Mellanox Technologies Device 0016
        Flags: bus master, fast devsel, latency 0, IRQ 17
        Memory at 3da0fbe00000 (64-bit, non-prefetchable) [size=1M]
        Memory at 3da0fc000000 (64-bit, prefetchable) [size=32M]
        Expansion ROM at 3da0fbf00000 [disabled] [size=1M]
        Capabilities: [40] Power Management version 3
        Capabilities: [48] Vital Product Data
        Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [148] Device Serial Number 00-02-c9-03-00-4b-97-c4
        Kernel driver in use: mlx4_core
        Kernel modules: mlx4_core

I will try some other tests here and report my results.

Regards.
Cascardo.

Thadeu Lima de Souza Cascardo - Aug. 3, 2012, 1:54 p.m.
On Fri, Aug 03, 2012 at 08:29:26AM +0000, Yevgeny Petrilin wrote:
> > 
> > Mellanox Ethernet adapters support Large Segmentation Offload for UDP
> > packets. The only change needed is using the proper header size when the
> > packet is UDP instead of TCP.
> > 
> > This significantly increases performance for large UDP packets on platforms
> > which have an expensive dma_map call, like pseries.
> > 
> > On a simple test with 64000 payload size, throughput has increased from
> > about 6Gbps to 9.5Gbps, while CPU use dropped from about 600% to about
> > 80% or less, on a 8-core Power7 machine.
> > 
> Hi Thadeu,
> Can you please send the info regarding the adapter you are testing with? What test are you running?
> I just tried this patch with netperf on my x86_64, and it doesn't work. Packets are not fragmented properly (fragment offsets are not calculated).
> It is true that the TX side doesn't work as hard (OS doesn't need to do the fragmentation), but traffic is not sent properly on the wire.
> 
> I'll do further investigation and get back with more details.
> 
> Yevgeny
> 

Hi, Yevgeny.

You are right. After capturing a dump on the receiving end while sending
a single large packet, I noticed that the fragment offsets were all 0,
the more-fragments flag was not set, and the IP ID was incremented for
each fragment.
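
For reference, correct fragmentation of a single large datagram should give
fragments that all share one IP ID, with offsets growing in 8-byte units and
the more-fragments flag set on every fragment but the last. A minimal
user-space sketch of the expected on-wire values (illustrative only: it
assumes a 1500-byte MTU, i.e. 1480 bytes of IP payload per fragment, and the
ID value is just a placeholder):

#include <stdio.h>

int main(void)
{
	unsigned int payload = 8 + 64000;	/* UDP header + 64000-byte payload */
	unsigned int per_frag = 1480;		/* IP payload per fragment, multiple of 8 */
	unsigned int off;

	for (off = 0; off < payload; off += per_frag) {
		int last = off + per_frag >= payload;

		/* Same IP ID on every fragment; frag_off counts 8-byte units;
		 * MF set on all fragments except the last one.
		 */
		printf("id=0x1234 frag_off=%u MF=%d len=%u\n",
		       off / 8, !last, last ? payload - off : per_frag);
	}
	return 0;
}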

Does the hardware really support UFO as documented? Should we just write
the IP header to the descriptor? I will try this and report my results.
Meanwhile, can you find out what is needed to get this working?

Thanks a lot.
Cascardo.


Patch

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index edd9cb8..59e808a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1660,7 +1660,7 @@  int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 	 */
 	dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
 	if (mdev->LSO_support)
-		dev->hw_features |= NETIF_F_TSO | NETIF_F_TSO6;
+		dev->hw_features |= NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_UFO;
 
 	dev->vlan_features = dev->hw_features;
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 019d856..2aad5a4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -39,6 +39,7 @@ 
 #include <linux/if_vlan.h>
 #include <linux/vmalloc.h>
 #include <linux/tcp.h>
+#include <linux/udp.h>
 #include <linux/moduleparam.h>
 
 #include "mlx4_en.h"
@@ -455,7 +456,11 @@  static int get_real_size(struct sk_buff *skb, struct net_device *dev,
 	int real_size;
 
 	if (skb_is_gso(skb)) {
-		*lso_header_size = skb_transport_offset(skb) + tcp_hdrlen(skb);
+		*lso_header_size = skb_transport_offset(skb);
+		if (skb_shinfo(skb)->gso_type == SKB_GSO_UDP)
+			*lso_header_size += sizeof(struct udphdr);
+		else
+			*lso_header_size += tcp_hdrlen(skb);
 		real_size = CTRL_SIZE + skb_shinfo(skb)->nr_frags * DS_SIZE +
 			ALIGN(*lso_header_size + 4, DS_SIZE);
 		if (unlikely(*lso_header_size != skb_headlen(skb))) {