[ovs-dev] Re: [PATCH v4 0/3] Add support for TSO with DPDK

Message ID 4e35a3d810ac4ca3b420d8cd053ca45c@inspur.com
State New

Commit Message

Yi Yang (杨燚)-云服务集团 Feb. 20, 2020, 10:10 a.m. UTC
Hi, Flavio

I find that this TSO feature doesn't work properly on my Ubuntu 16.04
system; my results are below. My kernel version is:

$ uname -a
Linux cmp008 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$

$ ./run-iperf3.sh
Connecting to host 10.15.1.3, port 5201
[  4] local 10.15.1.2 port 56466 connected to 10.15.1.3 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-10.00  sec  7.05 MBytes  5.91 Mbits/sec  2212   5.66 KBytes
[  4]  10.00-20.00  sec  7.67 MBytes  6.44 Mbits/sec  2484   5.66 KBytes
[  4]  20.00-30.00  sec  7.77 MBytes  6.52 Mbits/sec  2500   5.66 KBytes
[  4]  30.00-40.00  sec  7.77 MBytes  6.52 Mbits/sec  2490   5.66 KBytes
[  4]  40.00-50.00  sec  7.76 MBytes  6.51 Mbits/sec  2500   5.66 KBytes
[  4]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec  2504   5.66 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-60.00  sec  45.8 MBytes  6.40 Mbits/sec  14690             sender
[  4]   0.00-60.00  sec  45.7 MBytes  6.40 Mbits/sec                  receiver

Server output:
Accepted connection from 10.15.1.2, port 56464
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 56466
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.00  sec  6.90 MBytes  5.79 Mbits/sec
[  5]  10.00-20.00  sec  7.71 MBytes  6.47 Mbits/sec
[  5]  20.00-30.00  sec  7.73 MBytes  6.48 Mbits/sec
[  5]  30.00-40.00  sec  7.79 MBytes  6.53 Mbits/sec
[  5]  40.00-50.00  sec  7.79 MBytes  6.53 Mbits/sec
[  5]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec


iperf Done.
$

But it does work for tap. I'm not sure whether it is a kernel issue; which
kernel version are you using? I didn't use the tpacket_v3 patch. Here is my
local OVS info.

$ git log
commit 1223cf123ed141c0a0110ebed17572bdb2e3d0f4
Author: Ilya Maximets <i.maximets@ovn.org>
Date:   Thu Feb 6 14:24:23 2020 +0100

    netdev-dpdk: Don't enable offloading on HW device if not requested.

    DPDK drivers has different implementations of transmit functions.
    Enabled offloading may cause driver to choose slower variant
    significantly affecting performance if userspace TSO wasn't requested.

    Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
    Reported-by: David Marchand <david.marchand@redhat.com>
    Acked-by: David Marchand <david.marchand@redhat.com>
    Acked-by: Flavio Leitner <fbl@sysclose.org>
    Acked-by: Kevin Traynor <ktraynor@redhat.com>
    Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

commit 73858f9dbe83daf8cc8d4b604acc23eb62cc3f52
Author: Flavio Leitner <fbl@sysclose.org>
Date:   Mon Feb 3 18:45:50 2020 -0300

    netdev-linux: Prepend the std packet in the TSO packet

    Usually TSO packets are close to 50k, 60k bytes long, so to
    to copy less bytes when receiving a packet from the kernel
    change the approach. Instead of extending the MTU sized
    packet received and append with remaining TSO data from
    the TSO buffer, allocate a TSO packet with enough headroom
    to prepend the std packet data.

    Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
    Suggested-by: Ben Pfaff <blp@ovn.org>
    Signed-off-by: Flavio Leitner <fbl@sysclose.org>
    Signed-off-by: Ben Pfaff <blp@ovn.org>

commit 2297cbe6cc25b6b1862c499ce8f16f52f75d9e5f
Author: Flavio Leitner <fbl@sysclose.org>
Date:   Mon Feb 3 11:22:22 2020 -0300

    netdev-linux-private: fix max length to be 16 bits

    The dp_packet length is limited to 16 bits, so document that
    and fix the length value accordingly.

    Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
    Signed-off-by: Flavio Leitner <fbl@sysclose.org>
    Signed-off-by: Ben Pfaff <blp@ovn.org>

commit 3d6a6f450af5b7eaf4b532983cb14458ae792b72
Author: David Marchand <david.marchand@redhat.com>
Date:   Tue Feb 4 22:28:26 2020 +0100

    netdev-dpdk: Fix port init when lacking Tx offloads for TSO.

    The check on TSO capability did not ensure ip checksum, tcp checksum and
    TSO tx offloads were available which resulted in a port init failure
    (example below with a ena device):

    *2020-02-04T17:42:52.976Z|00084|dpdk|ERR|Ethdev port_id=0 requested Tx
    offloads 0x2a doesn't match Tx offloads capabilities 0xe in
    rte_eth_dev_configure()*

    Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")

    Reported-by: Ravi Kerur <rkerur@gmail.com>
    Signed-off-by: David Marchand <david.marchand@redhat.com>
    Acked-by: Kevin Traynor <ktraynor@redhat.com>
    Acked-by: Flavio Leitner <fbl@sysclose.org>
    Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

commit 8e371aa497aa95e3562d53f566c2d634b4b0f589
Author: Kirill A. Kornilov <kirill@tp>
Date:   Mon Jan 13 12:29:10 2020 +0300

    vswitchd: Add serial number configuration.

    Signed-off-by: Kirill A. Kornilov <kornilov@zelax.ru>
    Signed-off-by: Ben Pfaff <blp@ovn.org>
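
As an aside, the DPDK error quoted in the "Fix port init when lacking Tx offloads for TSO" commit above can be decoded by hand. The sketch below assumes the DEV_TX_OFFLOAD_* bit values from DPDK's rte_ethdev.h of that period; the dictionary and helper names are made up for illustration and are not part of OVS or DPDK.

```python
# Assumption: bit values of DPDK's DEV_TX_OFFLOAD_* flags (rte_ethdev.h).
# The dictionary and function names here are illustrative only.
TX_OFFLOAD_BITS = {
    0x01: "VLAN_INSERT",
    0x02: "IPV4_CKSUM",
    0x04: "UDP_CKSUM",
    0x08: "TCP_CKSUM",
    0x10: "SCTP_CKSUM",
    0x20: "TCP_TSO",
}


def decode(mask):
    """Return the names of the offload bits set in mask, lowest bit first."""
    return [name for bit, name in sorted(TX_OFFLOAD_BITS.items()) if mask & bit]


requested = 0x2A      # "requested Tx offloads 0x2a" from the log line
capabilities = 0x0E   # "Tx offloads capabilities 0xe" from the log line
missing = requested & ~capabilities

print("requested:   ", decode(requested))
print("capabilities:", decode(capabilities))
print("missing:     ", decode(missing))  # -> ['TCP_TSO']
```

Under that assumption the request is IPv4 checksum, TCP checksum and TSO, while the device in the example only offers checksum offloads, which matches the commit's description of the failure.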

I applied your tap patch.

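$ git diff
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index c6f3d27..74a5728 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1010,6 +1010,23 @@ netdev_linux_construct_tap(struct netdev *netdev_)
         goto error_close;
     }
 
+    if (userspace_tso_enabled()) {
+        /* Old kernels don't support TUNSETOFFLOAD. If TUNSETOFFLOAD is
+         * available, it will return EINVAL when a flag is unknown.
+         * Therefore, try enabling offload with no flags to check
+         * if TUNSETOFFLOAD support is available or not. */
+        if (ioctl(netdev->tap_fd, TUNSETOFFLOAD, 0) == 0 || errno != EINVAL) {
+            unsigned long oflags = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6;
+
+            if (ioctl(netdev->tap_fd, TUNSETOFFLOAD, oflags) == -1) {
+                VLOG_WARN("%s: enabling tap offloading failed: %s", name,
+                          ovs_strerror(errno));
+                error = errno;
+                goto error_close;
+            }
+        }
+    }
+
     netdev->present = true;
     return 0;
 
$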
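
For readers following along, the tap patch (quoted in full in the first reply below) first probes TUNSETOFFLOAD with no flags and only then asks for checksum and TSO offload. A minimal Python sketch of that decision follows. The ioctl number and flag values are taken from linux/if_tun.h; the function names are hypothetical, and this is not the OVS code itself.

```python
import errno
import fcntl

# Values from <linux/if_tun.h>.
TUNSETOFFLOAD = 0x400454D0
TUN_F_CSUM = 0x01
TUN_F_TSO4 = 0x02
TUN_F_TSO6 = 0x04
WANTED_FLAGS = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6


def probe_says_supported(ioctl_ok, err):
    """Mirror the patch's test: ioctl(...) == 0 || errno != EINVAL."""
    return ioctl_ok or err != errno.EINVAL


def enable_tap_offload(tap_fd):
    """Hypothetical helper: probe with no flags, then request the offloads.

    Returns False when the probe reports EINVAL (treated as "no
    TUNSETOFFLOAD support"); raises OSError if the real request fails.
    tap_fd must be an already configured tap file descriptor.
    """
    try:
        fcntl.ioctl(tap_fd, TUNSETOFFLOAD, 0)
        ok, err = True, 0
    except OSError as exc:
        ok, err = False, exc.errno
    if not probe_says_supported(ok, err):
        return False
    fcntl.ioctl(tap_fd, TUNSETOFFLOAD, WANTED_FLAGS)
    return True


# The three probe outcomes the condition distinguishes.
for ok, err in [(True, 0), (False, errno.EINVAL), (False, errno.EPERM)]:
    print(ok, errno.errorcode.get(err, "-"), probe_says_supported(ok, err))
```

Note that a probe failing with any error other than EINVAL still proceeds to the second ioctl, so in that case the failure is reported by the second call rather than silently skipped.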

Here is the performance result for tap.

$ ./run-iperf3.sh
Connecting to host 10.15.1.3, port 5201
[  4] local 10.15.1.2 port 56480 connected to 10.15.1.3 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-10.00  sec  19.4 GBytes  16.7 Gbits/sec    0   3.05 MBytes
[  4]  10.00-20.00  sec  18.3 GBytes  15.7 Gbits/sec    0   3.05 MBytes
[  4]  20.00-30.00  sec  17.6 GBytes  15.1 Gbits/sec    0   3.05 MBytes
[  4]  30.00-40.00  sec  19.3 GBytes  16.6 Gbits/sec    0   3.05 MBytes
[  4]  40.00-50.00  sec  18.8 GBytes  16.1 Gbits/sec    0   3.05 MBytes
[  4]  50.00-60.00  sec  17.9 GBytes  15.4 Gbits/sec    0   3.05 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-60.00  sec   111 GBytes  15.9 Gbits/sec    0             sender
[  4]   0.00-60.00  sec   111 GBytes  15.9 Gbits/sec                  receiver

Server output:
Accepted connection from 10.15.1.2, port 56478
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 56480
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.00  sec  19.3 GBytes  16.6 Gbits/sec
[  5]  10.00-20.00  sec  18.3 GBytes  15.7 Gbits/sec
[  5]  20.00-30.00  sec  17.6 GBytes  15.1 Gbits/sec
[  5]  30.00-40.00  sec  19.3 GBytes  16.6 Gbits/sec
[  5]  40.00-50.00  sec  18.8 GBytes  16.1 Gbits/sec
[  5]  50.00-60.00  sec  17.9 GBytes  15.4 Gbits/sec


iperf Done.
$

Comments

Flavio Leitner Feb. 20, 2020, 1:41 p.m. UTC | #1
On Thu, Feb 20, 2020 at 10:10:36AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, Flavio
> 
> I find this tso feature doesn't work normally on my Ubuntu 16.04, here is my
> result. My kernel version is 
> 
> $ uname -a
> Linux cmp008 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC
> 2019 x86_64 x86_64 x86_64 GNU/Linux
> $

I tested with 4.15.0 upstream and it worked. Can you do the same?

> $ ./run-iperf3.sh
> Connecting to host 10.15.1.3, port 5201
> [  4] local 10.15.1.2 port 56466 connected to 10.15.1.3 port 5201
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-10.00  sec  7.05 MBytes  5.91 Mbits/sec  2212   5.66 KBytes
> [  4]  10.00-20.00  sec  7.67 MBytes  6.44 Mbits/sec  2484   5.66 KBytes
> [  4]  20.00-30.00  sec  7.77 MBytes  6.52 Mbits/sec  2500   5.66 KBytes
> [  4]  30.00-40.00  sec  7.77 MBytes  6.52 Mbits/sec  2490   5.66 KBytes
> [  4]  40.00-50.00  sec  7.76 MBytes  6.51 Mbits/sec  2500   5.66 KBytes
> [  4]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec  2504   5.66 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-60.00  sec  45.8 MBytes  6.40 Mbits/sec  14690
> sender
> [  4]   0.00-60.00  sec  45.7 MBytes  6.40 Mbits/sec
> receiver

That looks like TSO packets are being dropped and the traffic is
basically TCP retransmissions of MTU size.
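
A rough way to check this reading against the iperf3 numbers is to express the reported Cwnd in MSS-sized segments. The sketch below is only an illustration: it assumes a 1500-byte MTU with TCP timestamps (MSS of 1448 bytes) and that iperf3's KBytes/MBytes are 1024-based, neither of which is stated in the thread.

```python
# Assumptions (not stated in the thread): 1500-byte MTU, TCP timestamps
# enabled, and 1024-based KBytes/MBytes in the iperf3 output.
MSS = 1500 - 20 - 20 - 12  # IP header, TCP header, timestamp option -> 1448
KIB = 1024
MIB = 1024 * 1024


def cwnd_in_segments(cwnd_bytes, mss=MSS):
    """Congestion window expressed in MSS-sized segments."""
    return cwnd_bytes / mss


def retransmits_per_segment(retransmits, transfer_bytes, mss=MSS):
    """Retransmits per MSS-sized segment of application data transferred."""
    return retransmits / (transfer_bytes / mss)


veth_cwnd = cwnd_in_segments(5.66 * KIB)   # Cwnd in the failing run
tap_cwnd = cwnd_in_segments(3.05 * MIB)    # Cwnd in the tap run
veth_retr = retransmits_per_segment(14690, 45.8 * MIB)

print(f"failing run cwnd: {veth_cwnd:.1f} segments")
print(f"tap run cwnd:     {tap_cwnd:.0f} segments")
print(f"failing run retransmits per segment: {veth_retr:.2f}")
```

Under these assumptions the failing run sits at about 4 segments of window with roughly 0.44 retransmits per segment delivered, while the tap run holds a window of roughly 2,200 segments with no retransmits.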

fbl


> I applied your tap patch.
> 
> $ git diff
> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> index c6f3d27..74a5728 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -1010,6 +1010,23 @@ netdev_linux_construct_tap(struct netdev *netdev_)
>          goto error_close;
>      }
> 
> +    if (userspace_tso_enabled()) {
> +        /* Old kernels don't support TUNSETOFFLOAD. If TUNSETOFFLOAD is
> +         * available, it will return EINVAL when a flag is unknown.
> +         * Therefore, try enabling offload with no flags to check
> +         * if TUNSETOFFLOAD support is available or not. */
> +        if (ioctl(netdev->tap_fd, TUNSETOFFLOAD, 0) == 0 || errno !=
> EINVAL) {
> +            unsigned long oflags = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6;
> +
> +            if (ioctl(netdev->tap_fd, TUNSETOFFLOAD, oflags) == -1) {
> +                VLOG_WARN("%s: enabling tap offloading failed: %s", name,
> +                          ovs_strerror(errno));
> +                error = errno;
> +                goto error_close;
> +            }
> +        }
> +    }
> +
>      netdev->present = true;
>      return 0;
> 
> $
William Tu Feb. 20, 2020, 7:20 p.m. UTC | #2
On Thu, Feb 20, 2020 at 2:12 AM Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com> wrote:
>
> Hi, Flavio
>
> I find this tso feature doesn't work normally on my Ubuntu 16.04, here is my
> result. My kernel version is

Hi Yiyang,

I'm so confused with your description. Which case does not work for you?
Yifeng and Flavio were using OVS-DPDK with vhostuser to VM, is this
the case you're talking about?

>
> $ uname -a
> Linux cmp008 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC
> 2019 x86_64 x86_64 x86_64 GNU/Linux
> $
>
> $ ./run-iperf3.sh

Which case is this one?

> Connecting to host 10.15.1.3, port 5201
> [  4] local 10.15.1.2 port 56466 connected to 10.15.1.3 port 5201
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-10.00  sec  7.05 MBytes  5.91 Mbits/sec  2212   5.66 KBytes
> [  4]  10.00-20.00  sec  7.67 MBytes  6.44 Mbits/sec  2484   5.66 KBytes
> [  4]  20.00-30.00  sec  7.77 MBytes  6.52 Mbits/sec  2500   5.66 KBytes
> [  4]  30.00-40.00  sec  7.77 MBytes  6.52 Mbits/sec  2490   5.66 KBytes
> [  4]  40.00-50.00  sec  7.76 MBytes  6.51 Mbits/sec  2500   5.66 KBytes
> [  4]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec  2504   5.66 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-60.00  sec  45.8 MBytes  6.40 Mbits/sec  14690
> sender
> [  4]   0.00-60.00  sec  45.7 MBytes  6.40 Mbits/sec
> receiver
>
> Server output:
> Accepted connection from 10.15.1.2, port 56464
> [  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 56466
> [ ID] Interval           Transfer     Bandwidth
> [  5]   0.00-10.00  sec  6.90 MBytes  5.79 Mbits/sec
> [  5]  10.00-20.00  sec  7.71 MBytes  6.47 Mbits/sec
> [  5]  20.00-30.00  sec  7.73 MBytes  6.48 Mbits/sec
> [  5]  30.00-40.00  sec  7.79 MBytes  6.53 Mbits/sec
> [  5]  40.00-50.00  sec  7.79 MBytes  6.53 Mbits/sec
> [  5]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec
>
>
> iperf Done.
> $
>
> But it does work for tap, I'm not sure if it is a kernel issue, which kernel
  ^^^^^^^^^^^^^^^^^^^^^^^
So which case does not work?

> version are you using? I didn't use tpacket_v3 patch. Here is my local ovs
> info.

William
Yi Yang (杨燚)-云服务集团 Feb. 21, 2020, 12:46 a.m. UTC | #3
No, I didn't use VMs, just veth in netns. I suspect it is an Ubuntu kernel bug.

Yi Yang (杨燚)-云服务集团 Feb. 21, 2020, 3:06 a.m. UTC | #4
Very weird: I built 4.15.9 from the upstream kernel and the result is the same. What's wrong? I can't understand it. This time I used the current OVS master directly.

$ ./run-iperf3.sh
Connecting to host 10.15.1.3, port 5201
[  4] local 10.15.1.2 port 54078 connected to 10.15.1.3 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-10.00  sec  6.01 MBytes  5.04 Mbits/sec  1688   5.66 KBytes
[  4]  10.00-20.00  sec  6.17 MBytes  5.17 Mbits/sec  1725   7.07 KBytes
[  4]  20.00-30.00  sec  6.51 MBytes  5.46 Mbits/sec  1828   5.66 KBytes
[  4]  30.00-40.00  sec  5.58 MBytes  4.68 Mbits/sec  1509   7.07 KBytes
[  4]  40.00-50.00  sec  4.83 MBytes  4.05 Mbits/sec  1182   7.07 KBytes
[  4]  50.00-60.00  sec  4.49 MBytes  3.77 Mbits/sec  1110   5.66 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-60.00  sec  33.6 MBytes  4.70 Mbits/sec  9042             sender
[  4]   0.00-60.00  sec  33.5 MBytes  4.69 Mbits/sec                  receiver

Server output:
Accepted connection from 10.15.1.2, port 54076
[  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 54078
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.00  sec  5.89 MBytes  4.94 Mbits/sec
[  5]  10.00-20.00  sec  6.23 MBytes  5.22 Mbits/sec
[  5]  20.00-30.00  sec  6.46 MBytes  5.42 Mbits/sec
[  5]  30.00-40.00  sec  5.62 MBytes  4.71 Mbits/sec
[  5]  40.00-50.00  sec  4.83 MBytes  4.05 Mbits/sec
[  5]  50.00-60.00  sec  4.45 MBytes  3.73 Mbits/sec


iperf Done.
$ uname -a
Linux cmp008 4.15.9 #1 SMP Fri Feb 21 09:27:41 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
eipadmin@cmp008:~$
eipadmin@cmp008:~$ cd ovs-master/
eipadmin@cmp008:~/ovs-master$ git log
commit ac23d20fc90da3b1c9b2117d1e22102e99fba006
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date:   Fri Feb 7 14:55:06 2020 -0800

    conntrack: Fix TCP conntrack state

    If a TCP connection is in SYN_SENT state, receiving another SYN packet
    would just renew the timeout of that conntrack entry rather than create
    a new one.  Thus, tcp_conn_update() should return CT_UPDATE_VALID_NEW.

    This also fixes regressions of a couple of  OVN system tests.

    Fixes: a867c010ee91 ("conntrack: Fix conntrack new state")
    Reported-by: Dumitru Ceara <dceara@redhat.com>
    Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
    Tested-by: Dumitru Ceara <dceara@redhat.com>
    Signed-off-by: William Tu <u9012063@gmail.com>

commit 486139d9e4b81dae04b2bb7487d45366865ac0ad
Author: Tomasz Konieczny <tomaszx.konieczny@intel.com>
Date:   Wed Feb 12 14:15:56 2020 +0100

    docs: Update DPDK version table

    Signed-off-by: Tomasz Konieczny <tomaszx.konieczny@intel.com>
    Acked-by: Flavio Leitner <fbl@sysclose.org>
    Acked-by: Kevin Traynor <ktraynor@redhat.com>
    Signed-off-by: Ian Stokes <ian.stokes@intel.com>

commit 9efbdaa201530ab7023a69176aba54c32c468efb
Author: Ben Pfaff <blp@ovn.org>
Date:   Thu Feb 13 16:27:01 2020 -0800

    Set release date for 2.13.0.

    The "Valentine's Day" release.

    Acked-by: Flavio Leitner <fbl@sysclose.org>
    Signed-off-by: Ben Pfaff <blp@ovn.org>

commit 19e99c83bb4da4617730f20392515d8aca5b61ba
Author: Yi-Hung Wei <yihung.wei@gmail.com>
$

Yi Yang (杨燚)-云服务集团 Feb. 23, 2020, 3:04 a.m. UTC | #5
Hi, Flavio

After running it repeatedly on different servers, I'm quite sure it doesn't work on the Ubuntu 16.04 kernel 4.15.0-55-generic or on upstream kernel 4.15.9. Can you tell me which kernel version you used when you ran the run-iperf3.sh I provided? I suspect this TSO patch for veth needs a newer kernel version.

-----Original Message-----
From: Flavio Leitner [mailto:fbl@sysclose.org]
Sent: February 20, 2020 21:41
To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
Cc: pkusunyifeng@gmail.com; dev@openvswitch.org; i.maximets@ovn.org; txfh2007@aliyun.com
Subject: Re: Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK

On Thu, Feb 20, 2020 at 10:10:36AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, Flavio
> 
> I find this tso feature doesn't work normally on my Ubuntu 16.04, here 
> is my result. My kernel version is
> 
> $ uname -a
> Linux cmp008 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> $

I tested with 4.15.0 upstream and it worked. Can you do the same?

> $ ./run-iperf3.sh
> Connecting to host 10.15.1.3, port 5201
> [  4] local 10.15.1.2 port 56466 connected to 10.15.1.3 port 5201
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-10.00  sec  7.05 MBytes  5.91 Mbits/sec  2212   5.66 KBytes
> [  4]  10.00-20.00  sec  7.67 MBytes  6.44 Mbits/sec  2484   5.66 KBytes
> [  4]  20.00-30.00  sec  7.77 MBytes  6.52 Mbits/sec  2500   5.66 KBytes
> [  4]  30.00-40.00  sec  7.77 MBytes  6.52 Mbits/sec  2490   5.66 KBytes
> [  4]  40.00-50.00  sec  7.76 MBytes  6.51 Mbits/sec  2500   5.66 KBytes
> [  4]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec  2504   5.66 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-60.00  sec  45.8 MBytes  6.40 Mbits/sec  14690             sender
> [  4]   0.00-60.00  sec  45.7 MBytes  6.40 Mbits/sec                  receiver

That looks like TSO packets are being dropped and the traffic is basically TCP retransmissions of MTU size.

fbl


> 
> Server output:
> Accepted connection from 10.15.1.2, port 56464
> [  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 56466
> [ ID] Interval           Transfer     Bandwidth
> [  5]   0.00-10.00  sec  6.90 MBytes  5.79 Mbits/sec
> [  5]  10.00-20.00  sec  7.71 MBytes  6.47 Mbits/sec
> [  5]  20.00-30.00  sec  7.73 MBytes  6.48 Mbits/sec
> [  5]  30.00-40.00  sec  7.79 MBytes  6.53 Mbits/sec
> [  5]  40.00-50.00  sec  7.79 MBytes  6.53 Mbits/sec
> [  5]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec
> 
> 
> iperf Done.
> $
> 
> But it does work for tap, I'm not sure if it is a kernel issue, which 
> kernel version are you using? I didn't use tpacket_v3 patch. Here is 
> my local ovs info.
> 
> $ git log
> commit 1223cf123ed141c0a0110ebed17572bdb2e3d0f4
> Author: Ilya Maximets <i.maximets@ovn.org>
> Date:   Thu Feb 6 14:24:23 2020 +0100
> 
>     netdev-dpdk: Don't enable offloading on HW device if not requested.
> 
>     DPDK drivers has different implementations of transmit functions.
>     Enabled offloading may cause driver to choose slower variant
>     significantly affecting performance if userspace TSO wasn't requested.
> 
>     Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
>     Reported-by: David Marchand <david.marchand@redhat.com>
>     Acked-by: David Marchand <david.marchand@redhat.com>
>     Acked-by: Flavio Leitner <fbl@sysclose.org>
>     Acked-by: Kevin Traynor <ktraynor@redhat.com>
>     Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> 
> commit 73858f9dbe83daf8cc8d4b604acc23eb62cc3f52
> Author: Flavio Leitner <fbl@sysclose.org>
> Date:   Mon Feb 3 18:45:50 2020 -0300
> 
>     netdev-linux: Prepend the std packet in the TSO packet
> 
>     Usually TSO packets are close to 50k, 60k bytes long, so to
>     to copy less bytes when receiving a packet from the kernel
>     change the approach. Instead of extending the MTU sized
>     packet received and append with remaining TSO data from
>     the TSO buffer, allocate a TSO packet with enough headroom
>     to prepend the std packet data.
> 
>     Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
>     Suggested-by: Ben Pfaff <blp@ovn.org>
>     Signed-off-by: Flavio Leitner <fbl@sysclose.org>
>     Signed-off-by: Ben Pfaff <blp@ovn.org>
> 
> commit 2297cbe6cc25b6b1862c499ce8f16f52f75d9e5f
> Author: Flavio Leitner <fbl@sysclose.org>
> Date:   Mon Feb 3 11:22:22 2020 -0300
> 
>     netdev-linux-private: fix max length to be 16 bits
> 
>     The dp_packet length is limited to 16 bits, so document that
>     and fix the length value accordingly.
> 
>     Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
>     Signed-off-by: Flavio Leitner <fbl@sysclose.org>
>     Signed-off-by: Ben Pfaff <blp@ovn.org>
> 
> commit 3d6a6f450af5b7eaf4b532983cb14458ae792b72
> Author: David Marchand <david.marchand@redhat.com>
> Date:   Tue Feb 4 22:28:26 2020 +0100
> 
>     netdev-dpdk: Fix port init when lacking Tx offloads for TSO.
> 
>     The check on TSO capability did not ensure ip checksum, tcp checksum and
>     TSO tx offloads were available which resulted in a port init failure
>     (example below with a ena device):
> 
>     *2020-02-04T17:42:52.976Z|00084|dpdk|ERR|Ethdev port_id=0 requested Tx
>     offloads 0x2a doesn't match Tx offloads capabilities 0xe in
>     rte_eth_dev_configure()*
> 
>     Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> 
>     Reported-by: Ravi Kerur <rkerur@gmail.com>
>     Signed-off-by: David Marchand <david.marchand@redhat.com>
>     Acked-by: Kevin Traynor <ktraynor@redhat.com>
>     Acked-by: Flavio Leitner <fbl@sysclose.org>
>     Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> 
> commit 8e371aa497aa95e3562d53f566c2d634b4b0f589
> Author: Kirill A. Kornilov <kirill@tp>
> Date:   Mon Jan 13 12:29:10 2020 +0300
> 
>     vswitchd: Add serial number configuration.
> 
>     Signed-off-by: Kirill A. Kornilov <kornilov@zelax.ru>
>     Signed-off-by: Ben Pfaff <blp@ovn.org>
> 
> I applied your tap patch.
> 
> $ git diff
> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> index c6f3d27..74a5728 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -1010,6 +1010,23 @@ netdev_linux_construct_tap(struct netdev *netdev_)
>          goto error_close;
>      }
> 
> +    if (userspace_tso_enabled()) {
> +        /* Old kernels don't support TUNSETOFFLOAD. If TUNSETOFFLOAD is
> +         * available, it will return EINVAL when a flag is unknown.
> +         * Therefore, try enabling offload with no flags to check
> +         * if TUNSETOFFLOAD support is available or not. */
> +        if (ioctl(netdev->tap_fd, TUNSETOFFLOAD, 0) == 0 || errno != EINVAL) {
> +            unsigned long oflags = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6;
> +
> +            if (ioctl(netdev->tap_fd, TUNSETOFFLOAD, oflags) == -1) {
> +                VLOG_WARN("%s: enabling tap offloading failed: %s", name,
> +                          ovs_strerror(errno));
> +                error = errno;
> +                goto error_close;
> +            }
> +        }
> +    }
> +
>      netdev->present = true;
>      return 0;
> 
> $
> 
> Here is performance result for tap.
> 
> $ ./run-iperf3.sh
> Connecting to host 10.15.1.3, port 5201
> [  4] local 10.15.1.2 port 56480 connected to 10.15.1.3 port 5201
> [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> [  4]   0.00-10.00  sec  19.4 GBytes  16.7 Gbits/sec    0   3.05 MBytes
> [  4]  10.00-20.00  sec  18.3 GBytes  15.7 Gbits/sec    0   3.05 MBytes
> [  4]  20.00-30.00  sec  17.6 GBytes  15.1 Gbits/sec    0   3.05 MBytes
> [  4]  30.00-40.00  sec  19.3 GBytes  16.6 Gbits/sec    0   3.05 MBytes
> [  4]  40.00-50.00  sec  18.8 GBytes  16.1 Gbits/sec    0   3.05 MBytes
> [  4]  50.00-60.00  sec  17.9 GBytes  15.4 Gbits/sec    0   3.05 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bandwidth       Retr
> [  4]   0.00-60.00  sec   111 GBytes  15.9 Gbits/sec    0             sender
> [  4]   0.00-60.00  sec   111 GBytes  15.9 Gbits/sec                  receiver
> 
> Server output:
> Accepted connection from 10.15.1.2, port 56478
> [  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 56480
> [ ID] Interval           Transfer     Bandwidth
> [  5]   0.00-10.00  sec  19.3 GBytes  16.6 Gbits/sec
> [  5]  10.00-20.00  sec  18.3 GBytes  15.7 Gbits/sec
> [  5]  20.00-30.00  sec  17.6 GBytes  15.1 Gbits/sec
> [  5]  30.00-40.00  sec  19.3 GBytes  16.6 Gbits/sec
> [  5]  40.00-50.00  sec  18.8 GBytes  16.1 Gbits/sec
> [  5]  50.00-60.00  sec  17.9 GBytes  15.4 Gbits/sec
> 
> 
> iperf Done.
> $



--
fbl
Yi Yang (杨燚)-云服务集团 Feb. 23, 2020, 8:45 a.m. UTC | #6
Hi, Flavio

Just to let you know, your TSO support patch does need a newer kernel. It would be great if you could add documentation telling users the minimum required kernel version. I can confirm that it works on Ubuntu 18.04 with kernel 5.3.0-40-generic.

vagrant@ubuntu1804:~$ uname -a
Linux ubuntu1804 5.3.0-40-generic #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
vagrant@ubuntu1804:~$

By the way, TPACKET_V3 can also support TSO without any further changes, so I think TPACKET_V3 will work with userspace-tso-enable=true as long as your TSO patch works for the veth-to-veth case.

I'll send out my tpacket patch v5 for review.

Flavio Leitner Feb. 28, 2020, 1:42 p.m. UTC | #7
On Sun, Feb 23, 2020 at 08:45:16AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, Flavio
> 
> Just to let you know, your TSO support patch does need a newer kernel. It would be great if you could add documentation telling users the minimum required kernel version. I can confirm that it works on Ubuntu 18.04 with kernel 5.3.0-40-generic.
> 
> vagrant@ubuntu1804:~$ uname -a
> Linux ubuntu1804 5.3.0-40-generic #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> vagrant@ubuntu1804:~$

OK, I was able to reproduce the poor throughput with veth-ovs-veth with
TSO on upstream 4.15.0, and I am going to dig deeper now.

Thanks,
fbl

Flavio Leitner Feb. 28, 2020, 5:56 p.m. UTC | #8
Hi Yi Yang,

This is the bug fix required to make veth TSO work in OvS:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9d2f67e43b73e8af7438be219b66a5de0cfa8bd9

commit 9d2f67e43b73e8af7438be219b66a5de0cfa8bd9
Author: Jianfeng Tan <jianfeng.tan@linux.alibaba.com>
Date:   Sat Sep 29 15:41:27 2018 +0000

    net/packet: fix packet drop as of virtio gso
    
    When we use raw socket as the vhost backend, a packet from virito with
    gso offloading information, cannot be sent out in later validaton at
    xmit path, as we did not set correct skb->protocol which is further used
    for looking up the gso function.
    
    To fix this, we set this field according to virito hdr information.
    
    Fixes: e858fae2b0b8f4 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
    Signed-off-by: Jianfeng Tan <jianfeng.tan@linux.alibaba.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>


So, the minimum kernel version is 4.19.

fbl

On Sun, Feb 23, 2020 at 08:45:16AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, Flavio
> 
> Just to let you know, your TSO support patch does need a newer kernel. It would be great if you could add documentation telling users the minimum required kernel version. I can confirm that it works on Ubuntu 18.04 with kernel 5.3.0-40-generic.
> 
> vagrant@ubuntu1804:~$ uname -a
> Linux ubuntu1804 5.3.0-40-generic #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> vagrant@ubuntu1804:~$
> 
> By the way, TPACKET_V3 can also support TSO without any further changes, so I think TPACKET_V3 will work with userspace-tso-enable=true as long as your TSO patch works for the veth-to-veth case.
> 
> I'll send out my tpacket patch v5 for review.
> 
> -----Original Message-----
> From: Yi Yang (杨燚)-云服务集团
> Sent: February 23, 2020 11:05
> To: 'fbl@sysclose.org' <fbl@sysclose.org>
> Cc: 'pkusunyifeng@gmail.com' <pkusunyifeng@gmail.com>; 'dev@openvswitch.org' <dev@openvswitch.org>; 'i.maximets@ovn.org' <i.maximets@ovn.org>; 'txfh2007@aliyun.com' <txfh2007@aliyun.com>
> Subject: Re: Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
> Importance: High
> 
> Hi, Flavio
> 
> After running it repeatedly on different servers, I'm now sure it doesn't work on the Ubuntu 16.04 kernel 4.15.0-55-generic or on upstream kernel 4.15.9. Can you tell me which kernel version you used when you ran the run-iperf3.sh I provided? I suspect this TSO patch for veth needs a newer kernel.
> 
> -----Original Message-----
> From: Flavio Leitner [mailto:fbl@sysclose.org]
> Sent: February 20, 2020 21:41
> To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
> Cc: pkusunyifeng@gmail.com; dev@openvswitch.org; i.maximets@ovn.org; txfh2007@aliyun.com
> Subject: Re: Re: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK
> 
> On Thu, Feb 20, 2020 at 10:10:36AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> > Hi, Flavio
> > 
> > I find this tso feature doesn't work normally on my Ubuntu 16.04, here 
> > is my result. My kernel version is
> > 
> > $ uname -a
> > Linux cmp008 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> > $
> 
> I tested with 4.15.0 upstream and it worked. Can you do the same?
> 
> > $ ./run-iperf3.sh
> > Connecting to host 10.15.1.3, port 5201
> > [  4] local 10.15.1.2 port 56466 connected to 10.15.1.3 port 5201
> > [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> > [  4]   0.00-10.00  sec  7.05 MBytes  5.91 Mbits/sec  2212   5.66 KBytes
> > [  4]  10.00-20.00  sec  7.67 MBytes  6.44 Mbits/sec  2484   5.66 KBytes
> > [  4]  20.00-30.00  sec  7.77 MBytes  6.52 Mbits/sec  2500   5.66 KBytes
> > [  4]  30.00-40.00  sec  7.77 MBytes  6.52 Mbits/sec  2490   5.66 KBytes
> > [  4]  40.00-50.00  sec  7.76 MBytes  6.51 Mbits/sec  2500   5.66 KBytes
> > [  4]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec  2504   5.66 KBytes
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval           Transfer     Bandwidth       Retr
> > [  4]   0.00-60.00  sec  45.8 MBytes  6.40 Mbits/sec  14690             sender
> > [  4]   0.00-60.00  sec  45.7 MBytes  6.40 Mbits/sec                  receiver
> 
> That looks like TSO packets are being dropped and the traffic is basically TCP retransmissions of MTU size.
> 
> fbl
> 
> 
> > 
> > Server output:
> > Accepted connection from 10.15.1.2, port 56464
> > [  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 56466
> > [ ID] Interval           Transfer     Bandwidth
> > [  5]   0.00-10.00  sec  6.90 MBytes  5.79 Mbits/sec
> > [  5]  10.00-20.00  sec  7.71 MBytes  6.47 Mbits/sec
> > [  5]  20.00-30.00  sec  7.73 MBytes  6.48 Mbits/sec
> > [  5]  30.00-40.00  sec  7.79 MBytes  6.53 Mbits/sec
> > [  5]  40.00-50.00  sec  7.79 MBytes  6.53 Mbits/sec
> > [  5]  50.00-60.00  sec  7.79 MBytes  6.54 Mbits/sec
> > 
> > 
> > iperf Done.
> > $
> > 
> > But it does work for tap. I'm not sure whether it is a kernel issue; which
> > kernel version are you using? I didn't use the tpacket_v3 patch. Here is
> > my local OVS info.
> > 
> > $ git log
> > commit 1223cf123ed141c0a0110ebed17572bdb2e3d0f4
> > Author: Ilya Maximets <i.maximets@ovn.org>
> > Date:   Thu Feb 6 14:24:23 2020 +0100
> > 
> >     netdev-dpdk: Don't enable offloading on HW device if not requested.
> > 
> >     DPDK drivers has different implementations of transmit functions.
> >     Enabled offloading may cause driver to choose slower variant
> >     significantly affecting performance if userspace TSO wasn't requested.
> > 
> >     Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> >     Reported-by: David Marchand <david.marchand@redhat.com>
> >     Acked-by: David Marchand <david.marchand@redhat.com>
> >     Acked-by: Flavio Leitner <fbl@sysclose.org>
> >     Acked-by: Kevin Traynor <ktraynor@redhat.com>
> >     Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> > 
> > commit 73858f9dbe83daf8cc8d4b604acc23eb62cc3f52
> > Author: Flavio Leitner <fbl@sysclose.org>
> > Date:   Mon Feb 3 18:45:50 2020 -0300
> > 
> >     netdev-linux: Prepend the std packet in the TSO packet
> > 
> >     Usually TSO packets are close to 50k, 60k bytes long, so to
> >     to copy less bytes when receiving a packet from the kernel
> >     change the approach. Instead of extending the MTU sized
> >     packet received and append with remaining TSO data from
> >     the TSO buffer, allocate a TSO packet with enough headroom
> >     to prepend the std packet data.
> > 
> >     Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> >     Suggested-by: Ben Pfaff <blp@ovn.org>
> >     Signed-off-by: Flavio Leitner <fbl@sysclose.org>
> >     Signed-off-by: Ben Pfaff <blp@ovn.org>
> > 
> > commit 2297cbe6cc25b6b1862c499ce8f16f52f75d9e5f
> > Author: Flavio Leitner <fbl@sysclose.org>
> > Date:   Mon Feb 3 11:22:22 2020 -0300
> > 
> >     netdev-linux-private: fix max length to be 16 bits
> > 
> >     The dp_packet length is limited to 16 bits, so document that
> >     and fix the length value accordingly.
> > 
> >     Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> >     Signed-off-by: Flavio Leitner <fbl@sysclose.org>
> >     Signed-off-by: Ben Pfaff <blp@ovn.org>
> > 
> > commit 3d6a6f450af5b7eaf4b532983cb14458ae792b72
> > Author: David Marchand <david.marchand@redhat.com>
> > Date:   Tue Feb 4 22:28:26 2020 +0100
> > 
> >     netdev-dpdk: Fix port init when lacking Tx offloads for TSO.
> > 
> >     The check on TSO capability did not ensure ip checksum, tcp checksum and
> >     TSO tx offloads were available which resulted in a port init failure
> >     (example below with a ena device):
> > 
> >     *2020-02-04T17:42:52.976Z|00084|dpdk|ERR|Ethdev port_id=0 requested Tx
> >     offloads 0x2a doesn't match Tx offloads capabilities 0xe in
> >     rte_eth_dev_configure()*
> > 
> >     Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
> > 
> >     Reported-by: Ravi Kerur <rkerur@gmail.com>
> >     Signed-off-by: David Marchand <david.marchand@redhat.com>
> >     Acked-by: Kevin Traynor <ktraynor@redhat.com>
> >     Acked-by: Flavio Leitner <fbl@sysclose.org>
> >     Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> > 
> > commit 8e371aa497aa95e3562d53f566c2d634b4b0f589
> > Author: Kirill A. Kornilov <kirill@tp>
> > Date:   Mon Jan 13 12:29:10 2020 +0300
> > 
> >     vswitchd: Add serial number configuration.
> > 
> >     Signed-off-by: Kirill A. Kornilov <kornilov@zelax.ru>
> >     Signed-off-by: Ben Pfaff <blp@ovn.org>
> > 
> > I applied your tap patch.
> > 
> > $ git diff
> > diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> > index c6f3d27..74a5728 100644
> > --- a/lib/netdev-linux.c
> > +++ b/lib/netdev-linux.c
> > @@ -1010,6 +1010,23 @@ netdev_linux_construct_tap(struct netdev *netdev_)
> >          goto error_close;
> >      }
> > 
> > +    if (userspace_tso_enabled()) {
> > +        /* Old kernels don't support TUNSETOFFLOAD. If TUNSETOFFLOAD is
> > +         * available, it will return EINVAL when a flag is unknown.
> > +         * Therefore, try enabling offload with no flags to check
> > +         * if TUNSETOFFLOAD support is available or not. */
> > +        if (ioctl(netdev->tap_fd, TUNSETOFFLOAD, 0) == 0 || errno != EINVAL) {
> > +            unsigned long oflags = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6;
> > +
> > +            if (ioctl(netdev->tap_fd, TUNSETOFFLOAD, oflags) == -1) {
> > +                VLOG_WARN("%s: enabling tap offloading failed: %s", name,
> > +                          ovs_strerror(errno));
> > +                error = errno;
> > +                goto error_close;
> > +            }
> > +        }
> > +    }
> > +
> >      netdev->present = true;
> >      return 0;
> > 
> > $
> > 
> > Here is the performance result for tap.
> > 
> > $ ./run-iperf3.sh
> > Connecting to host 10.15.1.3, port 5201
> > [  4] local 10.15.1.2 port 56480 connected to 10.15.1.3 port 5201
> > [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> > [  4]   0.00-10.00  sec  19.4 GBytes  16.7 Gbits/sec    0   3.05 MBytes
> > [  4]  10.00-20.00  sec  18.3 GBytes  15.7 Gbits/sec    0   3.05 MBytes
> > [  4]  20.00-30.00  sec  17.6 GBytes  15.1 Gbits/sec    0   3.05 MBytes
> > [  4]  30.00-40.00  sec  19.3 GBytes  16.6 Gbits/sec    0   3.05 MBytes
> > [  4]  40.00-50.00  sec  18.8 GBytes  16.1 Gbits/sec    0   3.05 MBytes
> > [  4]  50.00-60.00  sec  17.9 GBytes  15.4 Gbits/sec    0   3.05 MBytes
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval           Transfer     Bandwidth       Retr
> > [  4]   0.00-60.00  sec   111 GBytes  15.9 Gbits/sec    0             sender
> > [  4]   0.00-60.00  sec   111 GBytes  15.9 Gbits/sec                  receiver
> > 
> > Server output:
> > Accepted connection from 10.15.1.2, port 56478
> > [  5] local 10.15.1.3 port 5201 connected to 10.15.1.2 port 56480
> > [ ID] Interval           Transfer     Bandwidth
> > [  5]   0.00-10.00  sec  19.3 GBytes  16.6 Gbits/sec
> > [  5]  10.00-20.00  sec  18.3 GBytes  15.7 Gbits/sec
> > [  5]  20.00-30.00  sec  17.6 GBytes  15.1 Gbits/sec
> > [  5]  30.00-40.00  sec  19.3 GBytes  16.6 Gbits/sec
> > [  5]  40.00-50.00  sec  18.8 GBytes  16.1 Gbits/sec
> > [  5]  50.00-60.00  sec  17.9 GBytes  15.4 Gbits/sec
> > 
> > 
> > iperf Done.
> > $
> 
> 
> 
> --
> fbl
Yi Yang (杨燚)-云服务集团 March 1, 2020, 2:58 a.m. UTC | #9
Flavio, got it, thanks a lot. By the way, I investigated the excessive-retransmission issue for veth further, and I suspect it is also a veth-related bug on the kernel side; maybe you Red Hat kernel folks can help fix it. The tap interface doesn't have this issue and performs far better than the veth interface.

-----Original Message-----
From: Flavio Leitner [mailto:fbl@sysclose.org]
Sent: February 29, 2020 1:56
To: Yi Yang (杨燚)-云服务集团 <yangyi01@inspur.com>
Cc: pkusunyifeng@gmail.com; dev@openvswitch.org; i.maximets@ovn.org; txfh2007@aliyun.com
Subject: Re: RE: RE: [ovs-dev] [PATCH v4 0/3] Add support for TSO with DPDK


Hi Yi Yang,

This is the bug fix required to make veth TSO work in OvS:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9d2f67e43b73e8af7438be219b66a5de0cfa8bd9

commit 9d2f67e43b73e8af7438be219b66a5de0cfa8bd9
Author: Jianfeng Tan <jianfeng.tan@linux.alibaba.com>
Date:   Sat Sep 29 15:41:27 2018 +0000

    net/packet: fix packet drop as of virtio gso
    
    When we use raw socket as the vhost backend, a packet from virito with
    gso offloading information, cannot be sent out in later validaton at
    xmit path, as we did not set correct skb->protocol which is further used
    for looking up the gso function.
    
    To fix this, we set this field according to virito hdr information.
    
    Fixes: e858fae2b0b8f4 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
    Signed-off-by: Jianfeng Tan <jianfeng.tan@linux.alibaba.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>


So, the minimum kernel version is 4.19.

fbl
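
A minimal sketch of how a deployment script or test harness might check for the 4.19 minimum stated above before enabling userspace TSO on veth. This is not something OVS is known to do; the function names are hypothetical, and the check is only a heuristic, because a distribution kernel may carry the fix as a backport while reporting an older version (or the reverse).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Returns true if a kernel release string such as "4.15.0-55-generic"
 * is at least want_major.want_minor.  Only the leading "major.minor"
 * is parsed; a string that does not start that way yields false. */
bool
kernel_release_at_least(const char *release, int want_major, int want_minor)
{
    int major, minor;

    assert(release != NULL);
    if (sscanf(release, "%d.%d", &major, &minor) != 2) {
        return false;
    }
    return major > want_major
           || (major == want_major && minor >= want_minor);
}

/* Hypothetical helper: compares the running kernel (as reported by
 * uname(2)) against 4.19, the minimum version given in this thread. */
bool
running_kernel_has_veth_tso_fix(void)
{
    struct utsname uts;

    return uname(&uts) == 0 && kernel_release_at_least(uts.release, 4, 19);
}
```

With the two kernels reported in this thread, the release string "4.15.0-55-generic" fails the check and "5.3.0-40-generic" passes it, which matches the observed behaviour.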

On Sun, Feb 23, 2020 at 08:45:16AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, Flavio
> 
> Just to let you know, your TSO support patch does need a newer kernel version; it would be great if you could add documentation telling users the minimum required kernel version. I can confirm that it works after switching to Ubuntu 18.04 with kernel 5.3.0-40-generic.
> 
> vagrant@ubuntu1804:~$ uname -a
> Linux ubuntu1804 5.3.0-40-generic #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> vagrant@ubuntu1804:~$
> 
> By the way, TPACKET_V3 can also support TSO without any further changes, so I think TPACKET_V3 will work with userspace-tso-enable=true as long as your TSO patch works for the veth-to-veth case.
> 
> I'll send out my tpacket patch v5 for review.
> 



--
fbl
William Tu March 2, 2020, 7:08 p.m. UTC | #10
On Fri, Feb 28, 2020 at 9:56 AM Flavio Leitner <fbl@sysclose.org> wrote:
>
>
> Hi Yi Yang,
>
> This is the bug fix required to make veth TSO work in OvS:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9d2f67e43b73e8af7438be219b66a5de0cfa8bd9
>
> commit 9d2f67e43b73e8af7438be219b66a5de0cfa8bd9
> Author: Jianfeng Tan <jianfeng.tan@linux.alibaba.com>
> Date:   Sat Sep 29 15:41:27 2018 +0000
>
>     net/packet: fix packet drop as of virtio gso
>
>     When we use raw socket as the vhost backend, a packet from virito with
>     gso offloading information, cannot be sent out in later validaton at
>     xmit path, as we did not set correct skb->protocol which is further used
>     for looking up the gso function.
>
>     To fix this, we set this field according to virito hdr information.
>
>     Fixes: e858fae2b0b8f4 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
>     Signed-off-by: Jianfeng Tan <jianfeng.tan@linux.alibaba.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
>
> So, the minimum kernel version is 4.19.
>
Thanks,
I sent a patch to update the documentation. Please take a look.
William

Patch

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index c6f3d27..74a5728 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1010,6 +1010,23 @@  netdev_linux_construct_tap(struct netdev *netdev_)
         goto error_close;
     }

+    if (userspace_tso_enabled()) {
+        /* Old kernels don't support TUNSETOFFLOAD. If TUNSETOFFLOAD is
+         * available, it will return EINVAL when a flag is unknown.
+         * Therefore, try enabling offload with no flags to check
+         * if TUNSETOFFLOAD support is available or not. */
+        if (ioctl(netdev->tap_fd, TUNSETOFFLOAD, 0) == 0 || errno != EINVAL) {
+            unsigned long oflags = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6;
+
+            if (ioctl(netdev->tap_fd, TUNSETOFFLOAD, oflags) == -1) {
+                VLOG_WARN("%s: enabling tap offloading failed: %s", name,
+                          ovs_strerror(errno));
+                error = errno;
+                goto error_close;
+            }
+        }
+    }
+
     netdev->present = true;
     return 0;
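
The probing logic in the patch above can also be tried outside OVS. The following is a rough standalone sketch under stated assumptions: the function names are hypothetical, the caller is assumed to have already opened and configured the tap file descriptor, and, unlike the patch, an unavailable TUNSETOFFLOAD is reported as success with offloads left disabled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/if_tun.h>

/* Interprets the result of ioctl(fd, TUNSETOFFLOAD, 0).  probe_rc is the
 * ioctl return value (0 or -1) and probe_errno the errno seen on failure.
 * Following the comment in the patch, a failure with EINVAL is taken to
 * mean that the kernel has no TUNSETOFFLOAD support. */
bool
tunsetoffload_available(int probe_rc, int probe_errno)
{
    assert(probe_rc == 0 || probe_rc == -1);
    return probe_rc == 0 || probe_errno != EINVAL;
}

/* Tries to enable checksum and TSO offloads on a tap file descriptor.
 * Returns 0 on success, or when TUNSETOFFLOAD is unavailable, otherwise
 * a positive errno value. */
int
tap_enable_tso(int tap_fd)
{
    int rc = ioctl(tap_fd, TUNSETOFFLOAD, 0UL);
    int err = rc == 0 ? 0 : errno;

    if (!tunsetoffload_available(rc, err)) {
        return 0;               /* Old kernel: leave offloads disabled. */
    }

    unsigned long oflags = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6;

    if (ioctl(tap_fd, TUNSETOFFLOAD, oflags) == -1) {
        return errno;
    }
    return 0;
}
```

As in the patch, a probe that fails with an error other than EINVAL (a bad descriptor, for example) still proceeds to the real request, so the failure is reported by the second ioctl rather than silently ignored.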