diff mbox series

[ovs-dev,v2,3/3] netdev-dpdk: Enable TSO when using multi-seg mbufs

Message ID 1547139522-31154-4-git-send-email-tiago.lam@intel.com
State Changes Requested
Delegated to: Ian Stokes
Headers show
Series dpdk: Add support for TSO | expand

Commit Message

Lam, Tiago Jan. 10, 2019, 4:58 p.m. UTC
TCP Segmentation Offload (TSO) is a feature which enables the TCP/IP
network stack to delegate segmentation of a TCP segment to the hardware
NIC, thus saving compute resources. This may improve performance
significantly for TCP workload in virtualized environments.

While a previous commit already added the necesary logic to netdev-dpdk
to deal with packets marked for TSO, this set of changes enables TSO by
default when using multi-segment mbufs.

Thus, to enable TSO on the physical DPDK interfaces, only the following
command needs to be issued before starting OvS:
    ovs-vsctl set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true

Co-authored-by: Mark Kavanagh <mark.b.kavanagh@intel.com>

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Tiago Lam <tiago.lam@intel.com>
---
 Documentation/automake.mk           |   1 +
 Documentation/topics/dpdk/index.rst |   1 +
 Documentation/topics/dpdk/tso.rst   | 111 ++++++++++++++++++++++++++++++++++++
 NEWS                                |   1 +
 lib/netdev-dpdk.c                   |  63 +++++++++++++++++---
 5 files changed, 168 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/topics/dpdk/tso.rst

Comments

Stokes, Ian Jan. 11, 2019, 4:02 p.m. UTC | #1
On 1/10/2019 4:58 PM, Tiago Lam wrote:
> TCP Segmentation Offload (TSO) is a feature which enables the TCP/IP
> network stack to delegate segmentation of a TCP segment to the hardware
> NIC, thus saving compute resources. This may improve performance
> significantly for TCP workload in virtualized environments.
> 
> While a previous commit already added the necesary logic to netdev-dpdk
> to deal with packets marked for TSO, this set of changes enables TSO by
> default when using multi-segment mbufs.
> 
> Thus, to enable TSO on the physical DPDK interfaces, only the following
> command needs to be issued before starting OvS:
>      ovs-vsctl set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true

A general question (and not a blocker by any means), is it more 
intuitive to change above to be 'other_config:dpdk-tso=true' or 
something similar?

I'm thinking from a user perspective, they would be familiar with the 
concept of TSO already but not necessarily multi segments. It seems to 
be a technical per-requisite for DPDK (and OVS DPDK) under the hood.

But again multi segments will be needed for other offloads besides TSO 
so maybe it makes more sense to stick with the argument you have already.

> 
> Co-authored-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> 
> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Signed-off-by: Tiago Lam <tiago.lam@intel.com>
> ---
>   Documentation/automake.mk           |   1 +
>   Documentation/topics/dpdk/index.rst |   1 +
>   Documentation/topics/dpdk/tso.rst   | 111 ++++++++++++++++++++++++++++++++++++
>   NEWS                                |   1 +
>   lib/netdev-dpdk.c                   |  63 +++++++++++++++++---
>   5 files changed, 168 insertions(+), 9 deletions(-)
>   create mode 100644 Documentation/topics/dpdk/tso.rst
> 
> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> index 082438e..a20deb8 100644
> --- a/Documentation/automake.mk
> +++ b/Documentation/automake.mk
> @@ -39,6 +39,7 @@ DOC_SOURCE = \
>   	Documentation/topics/dpdk/index.rst \
>   	Documentation/topics/dpdk/bridge.rst \
>   	Documentation/topics/dpdk/jumbo-frames.rst \
> +	Documentation/topics/dpdk/tso.rst \
>   	Documentation/topics/dpdk/memory.rst \
>   	Documentation/topics/dpdk/pdump.rst \
>   	Documentation/topics/dpdk/phy.rst \
> diff --git a/Documentation/topics/dpdk/index.rst b/Documentation/topics/dpdk/index.rst
> index cf24a7b..eb2a04d 100644
> --- a/Documentation/topics/dpdk/index.rst
> +++ b/Documentation/topics/dpdk/index.rst
> @@ -40,4 +40,5 @@ The DPDK Datapath
>      /topics/dpdk/qos
>      /topics/dpdk/pdump
>      /topics/dpdk/jumbo-frames
> +   /topics/dpdk/tso
>      /topics/dpdk/memory
> diff --git a/Documentation/topics/dpdk/tso.rst b/Documentation/topics/dpdk/tso.rst
> new file mode 100644
> index 0000000..503354f
> --- /dev/null
> +++ b/Documentation/topics/dpdk/tso.rst
> @@ -0,0 +1,111 @@
> +..
> +      Copyright 2018, Red Hat, Inc.

Probably Intel in this case rather than RH.

> +
> +      Licensed under the Apache License, Version 2.0 (the "License"); you may
> +      not use this file except in compliance with the License. You may obtain
> +      a copy of the License at
> +
> +          http://www.apache.org/licenses/LICENSE-2.0
> +
> +      Unless required by applicable law or agreed to in writing, software
> +      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
> +      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
> +      License for the specific language governing permissions and limitations
> +      under the License.
> +
> +      Convention for heading levels in Open vSwitch documentation:
> +
> +      =======  Heading 0 (reserved for the title in a document)
> +      -------  Heading 1
> +      ~~~~~~~  Heading 2
> +      +++++++  Heading 3
> +      '''''''  Heading 4
> +
> +      Avoid deeper levels because they do not render well.
> +
> +===
> +TSO
> +===
You should probably add a notice that this feature is (experimental).

> +
> +.. versionadded:: 2.11.0
It's good practice to add the version added but traditionally we've done 
this after the fact once the release is out. Other wise you can have 
commits on master referring to a release that is not officially out yet.

This has come up before and in that case we waited for the release 
before adding.
> +
> +TCP Segmentation Offload (TSO) is a mechanism which allows a TCP/IP stack to
> +offload the TCP segmentation into hardware, thus saving the cycles that would
> +be required to perform this same segmentation in Software.
Minor: Is software capitalized for a reason above and throughout the 
text below? I would have thought lower case would be fine.

> +
> +TCP Segmentation Offload (TSO) enables a network stack to delegate segmentation
> +of an oversized TCP segment to the underlying physical NIC. Offload of frame
> +segmentation achieves computational savings in the core, freeing up CPU cycles
> +for more useful work.
> +
> +A common use case for TSO is when using virtualization, where traffic that's
> +coming in from a VM can offload the TCP segmentation, thus avoiding the
> +fragmentation in Software. Additionally, if the traffic is headed to a VM
> +within the same host further optimization can be expected. As the traffic never
> +leaves the machine, no MTU needs to be accounted for, and thus no segmentation
> +and checksum calculations are required, which saves yet more cycles. Only when
> +the traffic actually leaves the host the segmentation needs to happen, in which
> +case it will be performed by the egress NIC.
> +
> +When using TSO with DPDK, the implementation relies on the multi-segment mbufs
> +feature, described in :doc:`/topics/dpdk/jumbo-frames`, where each mbuf
> +contains ~2KiB of the entire packet's data and is linked to the next mbuf that
> +contains the next portion of data.
> +
> +Enabling TSO
> +~~~~~~~~~~~~
> +Once multi-segment mbufs is enabled, TSO will be enabled by default, if there's
> +support for it in the underlying physical NICs attached to OvS-DPDK.
> +

Probably worth calling out above with an Important: tag.

> +When using :doc:`vHost User ports <vhost-user>`, TSO may be enabled in one of
> +two ways, as follows.
> +
> +`TSO` is enabled in OvS by the DPDK vHost User backend; when a new guest
> +connection is established, `TSO` is thus advertised to the guest as an
> +available feature:
> +
> +  1. QEMU Command Line Parameter:
> +     ```
> +     sudo $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
> +     ...
> +     -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,\
> +     mrg_rxbuf=on,csum=on,gso=on,guest_csum=on,guest_tso4=on,\
> +     guest_tso6=on,guest_ecn=on \
> +     ...
> +     ```

Your passing gso above as well as tso, is that required? I wasn't sure 
myself.

> +
> +  2. ethtool
> +
> +Assuming that the guest's OS also supports `TSO`, ethtool can be used to enable
> +same:
> +
> +    ```
> +    ethtool -K eth0 sg on     # scatter-gather is a prerequisite for TSO
> +    ethtool -K eth0 tso on
> +    ethtool -k eth0
> +    ```
> +
> +    <b>Note:</b> In both methods, `mergeable buffers` are required:
Would call out above as important rather than note.

> +    ```
> +    sudo $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
> +    ...
> +    mrg_rxbuf=on,\
> +    ...
> +    ```
> +
> +To enable TSO in a guest, the underlying NIC must first support `TSO` - consult
> +your controller's datasheet for compatibility. Secondly, the NIC must have an
> +associated DPDK Poll Mode Driver (PMD) which supports `TSO`.
> +
> +~~~~~~~~~~~
> +Limitations
> +~~~~~~~~~~~
> +The current OvS `TSO` implementation supports flat and VLAN networks only (i.e.
> +no support for `TSO` over tunneled connection [VxLAN, GRE, IPinIP, etc.]).
> +
> +Also, as TSO is built on top of multi-segments mbufs, the constraints pointed
> +out in :doc:`/topics/dpdk/jumbo-frames` also apply for TSO. Thus, some
> +performance hits might be noticed when running specific functionalitry, like
Minor typo in 'functionality' above.

> +the Userspace Connection tracker. And as mentioned in the same section, it is
> +paramount that a packet's headers is contained within the first mbuf (~2KiB in
> +size).
> diff --git a/NEWS b/NEWS
> index 98f5a9b..c7cfd17 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -23,6 +23,7 @@ Post-v2.10.0
>        * Add option for simple round-robin based Rxq to PMD assignment.
>          It can be set with pmd-rxq-assign.
>        * Add support for DPDK 18.11
> +     * Add support for TSO (between DPDK interfaces only).

can you add the experimental tag next to this (there's a few examples in 
the news doc).

>      - Add 'symmetric_l3' hash function.
>      - OVS now honors 'updelay' and 'downdelay' for bonds with LACP configured.
>      - ovs-vswitchd:
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index ad7223a..7af37ee 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -374,7 +374,8 @@ struct ingress_policer {
>   enum dpdk_hw_ol_features {
>       NETDEV_RX_CHECKSUM_OFFLOAD = 1 << 0,
>       NETDEV_RX_HW_CRC_STRIP = 1 << 1,
> -    NETDEV_RX_HW_SCATTER = 1 << 2
> +    NETDEV_RX_HW_SCATTER = 1 << 2,
> +    NETDEV_TX_TSO_OFFLOAD = 1 << 3,
>   };
>   
>   /*
> @@ -1005,19 +1006,26 @@ dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq)
>       /* Multi-segment-mbuf-specific setup. */
>       if (dpdk_multi_segment_mbufs) {
>           if (info.tx_offload_capa & DEV_TX_OFFLOAD_MULTI_SEGS) {
> -            /* Enable multi-seg mbufs. DPDK PMDs typically attempt to use
> -             * simple or vectorized transmit functions, neither of which are
> -             * compatible with multi-segment mbufs. */
> +            /* DPDK PMDs typically attempt to use simple or vectorized
> +             * transmit functions, neither of which are compatible with
> +             * multi-segment mbufs. Ensure that these are disabled when
> +             * multi-segment mbufs are enabled.
> +             */
I would think this comment should be added when the conf.txmode.offloads 
|= DEV_TX_OFFLOAD_MULTI_SEGS; cod below was introduced rather than at 
this stage?
>               conf.txmode.offloads |= DEV_TX_OFFLOAD_MULTI_SEGS;
> -        } else {
> -            VLOG_WARN("Interface %s doesn't support multi-segment mbufs",
> -                      dev->up.name);
> -            conf.txmode.offloads &= ~DEV_TX_OFFLOAD_MULTI_SEGS;
Although it's removed here, I believe we discussed that rather than a 
warning the device should fail to configure if MSEG is enabled but the 
device doesn't support?
> +        }
> +
> +        if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
> +            conf.txmode.offloads |= DEV_TX_OFFLOAD_TCP_TSO;
> +            conf.txmode.offloads |= DEV_TX_OFFLOAD_TCP_CKSUM;
> +            conf.txmode.offloads |= DEV_TX_OFFLOAD_IPV4_CKSUM;
>           }
>   
>           txconf = info.default_txconf;
> -        //txconf.txq_flags = ETH_TXQ_FLAGS_IGNORE;
I guess this is left over from previous dev work you had, should be 
removed from this patch for the next revision.

>           txconf.offloads = conf.txmode.offloads;
> +    } else if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
> +        dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
> +        VLOG_WARN("Failed to set Tx TSO offload in %s. Requires option "
> +                  "`dpdk-multi-seg-mbufs` to be enabled.", dev->up.name);
>       }
>   
>       conf.intr_conf.lsc = dev->lsc_interrupt_mode;
> @@ -1134,6 +1142,9 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>       uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
>                                        DEV_RX_OFFLOAD_TCP_CKSUM |
>                                        DEV_RX_OFFLOAD_IPV4_CKSUM;
> +    uint32_t tx_tso_offload_capa = DEV_TX_OFFLOAD_TCP_TSO |
> +                                   DEV_TX_OFFLOAD_TCP_CKSUM |
> +                                   DEV_TX_OFFLOAD_IPV4_CKSUM;
>   
>       rte_eth_dev_info_get(dev->port_id, &info);
>   
> @@ -1160,6 +1171,18 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>           dev->hw_ol_features &= ~NETDEV_RX_HW_SCATTER;
>       }
>   
> +    if (dpdk_multi_segment_mbufs) {
> +        if (info.tx_offload_capa & tx_tso_offload_capa) {
> +            dev->hw_ol_features |= NETDEV_TX_TSO_OFFLOAD;
> +        } else {
> +            dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
> +            VLOG_WARN("Tx TSO offload is not supported on port "
> +                      DPDK_PORT_ID_FMT, dev->port_id);
> +        }
> +    } else {
> +        dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
> +    }
> +
>       n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
>       n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
>   
> @@ -1684,6 +1707,11 @@ netdev_dpdk_get_config(const struct netdev *netdev, struct smap *args)
>           } else {
>               smap_add(args, "rx_csum_offload", "false");
>           }
> +        if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
> +            smap_add(args, "tx_tso_offload", "true");
> +        } else {
> +            smap_add(args, "tx_tso_offload", "false");
> +        }
>           smap_add(args, "lsc_interrupt_mode",
>                    dev->lsc_interrupt_mode ? "true" : "false");
>       }
> @@ -2372,6 +2400,15 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
>       for (i = 0; i < pkt_cnt; i++) {
>           pkt = pkts[i];
>   
> +        /* Drop TSO packet if there's no TSO support on egress port */
> +        if ((pkt->ol_flags & PKT_TX_TCP_SEG) &&
> +            !(dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD)) {
> +            VLOG_WARN_RL(&rl, "%s: TSO is disabled on port, TSO packet dropped"
> +                        "%" PRIu32 " ", dev->up.name, pkt->pkt_len);
> +            rte_pktmbuf_free(pkt);
> +            continue;
> +        }
Hmmm, have we changed the nature of the function now. where as before it 
was just checking packet length now we inspect for offload type as well.

It could be better to rename the function 'netdev_dpdk_filter_packet' 
and add an explanation in the comment preceding the function WRT what 
filters are applied (length and TSO offload support).

Ian
> +
>           if (OVS_UNLIKELY(pkt->pkt_len > dev->max_packet_len)) {
>               if (!(pkt->ol_flags & PKT_TX_TCP_SEG)) {
>                   VLOG_WARN_RL(&rl, "%s: Too big size %" PRIu32 " "
> @@ -4245,6 +4282,14 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
>           dev->tx_q[0].map = 0;
>       }
>   
> +    if (dpdk_multi_segment_mbufs) {
> +        dev->hw_ol_features |= NETDEV_TX_TSO_OFFLOAD;
> +
> +        VLOG_DBG("%s: TSO enabled on vhost port", dev->up.name);
> +    } else {
> +        dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
> +    }
> +
>       netdev_dpdk_remap_txqs(dev);
>   
>       err = netdev_dpdk_mempool_configure(dev);
>
Lam, Tiago Jan. 12, 2019, 7:31 p.m. UTC | #2
On 11/01/2019 16:02, Ian Stokes wrote:
> On 1/10/2019 4:58 PM, Tiago Lam wrote:
>> TCP Segmentation Offload (TSO) is a feature which enables the TCP/IP
>> network stack to delegate segmentation of a TCP segment to the hardware
>> NIC, thus saving compute resources. This may improve performance
>> significantly for TCP workload in virtualized environments.
>>
>> While a previous commit already added the necesary logic to netdev-dpdk
>> to deal with packets marked for TSO, this set of changes enables TSO by
>> default when using multi-segment mbufs.
>>
>> Thus, to enable TSO on the physical DPDK interfaces, only the following
>> command needs to be issued before starting OvS:
>>      ovs-vsctl set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true
> 
> A general question (and not a blocker by any means), is it more 
> intuitive to change above to be 'other_config:dpdk-tso=true' or 
> something similar?
> 
> I'm thinking from a user perspective, they would be familiar with the 
> concept of TSO already but not necessarily multi segments. It seems to 
> be a technical per-requisite for DPDK (and OVS DPDK) under the hood.
> 
> But again multi segments will be needed for other offloads besides TSO 
> so maybe it makes more sense to stick with the argument you have already.

That was my first thought, and the  main reason why I went with
`dpdk-multi-seg-mbufs`. But I'm open to suggestions, as mentioned in the
first RFC of the series. If people think it's better to go with
`dpdk-tso` for now, I'd gladly change.

Thanks for the comments. I've replied to some in-line, but assume I'll
deal with ones I haven't comment as well.

> 
>>
>> Co-authored-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>>
>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> Signed-off-by: Tiago Lam <tiago.lam@intel.com>
>> ---
>>   Documentation/automake.mk           |   1 +
>>   Documentation/topics/dpdk/index.rst |   1 +
>>   Documentation/topics/dpdk/tso.rst   | 111 ++++++++++++++++++++++++++++++++++++
>>   NEWS                                |   1 +
>>   lib/netdev-dpdk.c                   |  63 +++++++++++++++++---
>>   5 files changed, 168 insertions(+), 9 deletions(-)
>>   create mode 100644 Documentation/topics/dpdk/tso.rst
>>
>> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
>> index 082438e..a20deb8 100644
>> --- a/Documentation/automake.mk
>> +++ b/Documentation/automake.mk
>> @@ -39,6 +39,7 @@ DOC_SOURCE = \
>>   	Documentation/topics/dpdk/index.rst \
>>   	Documentation/topics/dpdk/bridge.rst \
>>   	Documentation/topics/dpdk/jumbo-frames.rst \
>> +	Documentation/topics/dpdk/tso.rst \
>>   	Documentation/topics/dpdk/memory.rst \
>>   	Documentation/topics/dpdk/pdump.rst \
>>   	Documentation/topics/dpdk/phy.rst \
>> diff --git a/Documentation/topics/dpdk/index.rst b/Documentation/topics/dpdk/index.rst
>> index cf24a7b..eb2a04d 100644
>> --- a/Documentation/topics/dpdk/index.rst
>> +++ b/Documentation/topics/dpdk/index.rst
>> @@ -40,4 +40,5 @@ The DPDK Datapath
>>      /topics/dpdk/qos
>>      /topics/dpdk/pdump
>>      /topics/dpdk/jumbo-frames
>> +   /topics/dpdk/tso
>>      /topics/dpdk/memory
>> diff --git a/Documentation/topics/dpdk/tso.rst b/Documentation/topics/dpdk/tso.rst
>> new file mode 100644
>> index 0000000..503354f
>> --- /dev/null
>> +++ b/Documentation/topics/dpdk/tso.rst
>> @@ -0,0 +1,111 @@
>> +..
>> +      Copyright 2018, Red Hat, Inc.
> 
> Probably Intel in this case rather than RH.

:)

> 
>> +
>> +      Licensed under the Apache License, Version 2.0 (the "License"); you may
>> +      not use this file except in compliance with the License. You may obtain
>> +      a copy of the License at
>> +
>> +          http://www.apache.org/licenses/LICENSE-2.0
>> +
>> +      Unless required by applicable law or agreed to in writing, software
>> +      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
>> +      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
>> +      License for the specific language governing permissions and limitations
>> +      under the License.
>> +
>> +      Convention for heading levels in Open vSwitch documentation:
>> +
>> +      =======  Heading 0 (reserved for the title in a document)
>> +      -------  Heading 1
>> +      ~~~~~~~  Heading 2
>> +      +++++++  Heading 3
>> +      '''''''  Heading 4
>> +
>> +      Avoid deeper levels because they do not render well.
>> +
>> +===
>> +TSO
>> +===
> You should probably add a notice that this feature is (experimental).
> 
>> +
>> +.. versionadded:: 2.11.0
> It's good practice to add the version added but traditionally we've done 
> this after the fact once the release is out. Other wise you can have 
> commits on master referring to a release that is not officially out yet.
> 
> This has come up before and in that case we waited for the release 
> before adding.
>> +
>> +TCP Segmentation Offload (TSO) is a mechanism which allows a TCP/IP stack to
>> +offload the TCP segmentation into hardware, thus saving the cycles that would
>> +be required to perform this same segmentation in Software.
> Minor: Is software capitalized for a reason above and throughout the 
> text below? I would have thought lower case would be fine.
> 
>> +
>> +TCP Segmentation Offload (TSO) enables a network stack to delegate segmentation
>> +of an oversized TCP segment to the underlying physical NIC. Offload of frame
>> +segmentation achieves computational savings in the core, freeing up CPU cycles
>> +for more useful work.
>> +
>> +A common use case for TSO is when using virtualization, where traffic that's
>> +coming in from a VM can offload the TCP segmentation, thus avoiding the
>> +fragmentation in Software. Additionally, if the traffic is headed to a VM
>> +within the same host further optimization can be expected. As the traffic never
>> +leaves the machine, no MTU needs to be accounted for, and thus no segmentation
>> +and checksum calculations are required, which saves yet more cycles. Only when
>> +the traffic actually leaves the host the segmentation needs to happen, in which
>> +case it will be performed by the egress NIC.
>> +
>> +When using TSO with DPDK, the implementation relies on the multi-segment mbufs
>> +feature, described in :doc:`/topics/dpdk/jumbo-frames`, where each mbuf
>> +contains ~2KiB of the entire packet's data and is linked to the next mbuf that
>> +contains the next portion of data.
>> +
>> +Enabling TSO
>> +~~~~~~~~~~~~
>> +Once multi-segment mbufs is enabled, TSO will be enabled by default, if there's
>> +support for it in the underlying physical NICs attached to OvS-DPDK.
>> +
> 
> Probably worth calling out above with an Important: tag.
> 
>> +When using :doc:`vHost User ports <vhost-user>`, TSO may be enabled in one of
>> +two ways, as follows.
>> +
>> +`TSO` is enabled in OvS by the DPDK vHost User backend; when a new guest
>> +connection is established, `TSO` is thus advertised to the guest as an
>> +available feature:
>> +
>> +  1. QEMU Command Line Parameter:
>> +     ```
>> +     sudo $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
>> +     ...
>> +     -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,\
>> +     mrg_rxbuf=on,csum=on,gso=on,guest_csum=on,guest_tso4=on,\
>> +     guest_tso6=on,guest_ecn=on \
>> +     ...
>> +     ```
> 
> Your passing gso above as well as tso, is that required? I wasn't sure 
> myself.

Good point. I've cleaned this example to include the bare minimum; GSO
is not required here.

> 
>> +
>> +  2. ethtool
>> +
>> +Assuming that the guest's OS also supports `TSO`, ethtool can be used to enable
>> +same:
>> +
>> +    ```
>> +    ethtool -K eth0 sg on     # scatter-gather is a prerequisite for TSO
>> +    ethtool -K eth0 tso on
>> +    ethtool -k eth0
>> +    ```
>> +
>> +    <b>Note:</b> In both methods, `mergeable buffers` are required:
> Would call out above as important rather than note.
> 
>> +    ```
>> +    sudo $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
>> +    ...
>> +    mrg_rxbuf=on,\
>> +    ...
>> +    ```
>> +
>> +To enable TSO in a guest, the underlying NIC must first support `TSO` - consult
>> +your controller's datasheet for compatibility. Secondly, the NIC must have an
>> +associated DPDK Poll Mode Driver (PMD) which supports `TSO`.
>> +
>> +~~~~~~~~~~~
>> +Limitations
>> +~~~~~~~~~~~
>> +The current OvS `TSO` implementation supports flat and VLAN networks only (i.e.
>> +no support for `TSO` over tunneled connection [VxLAN, GRE, IPinIP, etc.]).
>> +
>> +Also, as TSO is built on top of multi-segments mbufs, the constraints pointed
>> +out in :doc:`/topics/dpdk/jumbo-frames` also apply for TSO. Thus, some
>> +performance hits might be noticed when running specific functionalitry, like
> Minor typo in 'functionality' above.
> 
>> +the Userspace Connection tracker. And as mentioned in the same section, it is
>> +paramount that a packet's headers is contained within the first mbuf (~2KiB in
>> +size).
>> diff --git a/NEWS b/NEWS
>> index 98f5a9b..c7cfd17 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -23,6 +23,7 @@ Post-v2.10.0
>>        * Add option for simple round-robin based Rxq to PMD assignment.
>>          It can be set with pmd-rxq-assign.
>>        * Add support for DPDK 18.11
>> +     * Add support for TSO (between DPDK interfaces only).
> 
> can you add the experimental tag next to this (there's a few examples in 
> the news doc).
> 
>>      - Add 'symmetric_l3' hash function.
>>      - OVS now honors 'updelay' and 'downdelay' for bonds with LACP configured.
>>      - ovs-vswitchd:
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index ad7223a..7af37ee 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -374,7 +374,8 @@ struct ingress_policer {
>>   enum dpdk_hw_ol_features {
>>       NETDEV_RX_CHECKSUM_OFFLOAD = 1 << 0,
>>       NETDEV_RX_HW_CRC_STRIP = 1 << 1,
>> -    NETDEV_RX_HW_SCATTER = 1 << 2
>> +    NETDEV_RX_HW_SCATTER = 1 << 2,
>> +    NETDEV_TX_TSO_OFFLOAD = 1 << 3,
>>   };
>>   
>>   /*
>> @@ -1005,19 +1006,26 @@ dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq)
>>       /* Multi-segment-mbuf-specific setup. */
>>       if (dpdk_multi_segment_mbufs) {
>>           if (info.tx_offload_capa & DEV_TX_OFFLOAD_MULTI_SEGS) {
>> -            /* Enable multi-seg mbufs. DPDK PMDs typically attempt to use
>> -             * simple or vectorized transmit functions, neither of which are
>> -             * compatible with multi-segment mbufs. */
>> +            /* DPDK PMDs typically attempt to use simple or vectorized
>> +             * transmit functions, neither of which are compatible with
>> +             * multi-segment mbufs. Ensure that these are disabled when
>> +             * multi-segment mbufs are enabled.
>> +             */
> I would think this comment should be added when the conf.txmode.offloads 
> |= DEV_TX_OFFLOAD_MULTI_SEGS; cod below was introduced rather than at 
> this stage?
>>               conf.txmode.offloads |= DEV_TX_OFFLOAD_MULTI_SEGS;
>> -        } else {
>> -            VLOG_WARN("Interface %s doesn't support multi-segment mbufs",
>> -                      dev->up.name);
>> -            conf.txmode.offloads &= ~DEV_TX_OFFLOAD_MULTI_SEGS;
> Although it's removed here, I believe we discussed that rather than a 
> warning the device should fail to configure if MSEG is enabled but the 
> device doesn't support?

This seems to have been something I missed while rebasing TSO on top of
the latest multi-segs. I've fixed it now.

>> +        }
>> +
>> +        if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
>> +            conf.txmode.offloads |= DEV_TX_OFFLOAD_TCP_TSO;
>> +            conf.txmode.offloads |= DEV_TX_OFFLOAD_TCP_CKSUM;
>> +            conf.txmode.offloads |= DEV_TX_OFFLOAD_IPV4_CKSUM;
>>           }
>>   
>>           txconf = info.default_txconf;
>> -        //txconf.txq_flags = ETH_TXQ_FLAGS_IGNORE;
> I guess this is left over from previous dev work you had, should be 
> removed from this patch for the next revision.
> 
>>           txconf.offloads = conf.txmode.offloads;
>> +    } else if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
>> +        dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
>> +        VLOG_WARN("Failed to set Tx TSO offload in %s. Requires option "
>> +                  "`dpdk-multi-seg-mbufs` to be enabled.", dev->up.name);
>>       }
>>   
>>       conf.intr_conf.lsc = dev->lsc_interrupt_mode;
>> @@ -1134,6 +1142,9 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>>       uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
>>                                        DEV_RX_OFFLOAD_TCP_CKSUM |
>>                                        DEV_RX_OFFLOAD_IPV4_CKSUM;
>> +    uint32_t tx_tso_offload_capa = DEV_TX_OFFLOAD_TCP_TSO |
>> +                                   DEV_TX_OFFLOAD_TCP_CKSUM |
>> +                                   DEV_TX_OFFLOAD_IPV4_CKSUM;
>>   
>>       rte_eth_dev_info_get(dev->port_id, &info);
>>   
>> @@ -1160,6 +1171,18 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>>           dev->hw_ol_features &= ~NETDEV_RX_HW_SCATTER;
>>       }
>>   
>> +    if (dpdk_multi_segment_mbufs) {
>> +        if (info.tx_offload_capa & tx_tso_offload_capa) {
>> +            dev->hw_ol_features |= NETDEV_TX_TSO_OFFLOAD;
>> +        } else {
>> +            dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
>> +            VLOG_WARN("Tx TSO offload is not supported on port "
>> +                      DPDK_PORT_ID_FMT, dev->port_id);
>> +        }
>> +    } else {
>> +        dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
>> +    }
>> +
>>       n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
>>       n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
>>   
>> @@ -1684,6 +1707,11 @@ netdev_dpdk_get_config(const struct netdev *netdev, struct smap *args)
>>           } else {
>>               smap_add(args, "rx_csum_offload", "false");
>>           }
>> +        if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
>> +            smap_add(args, "tx_tso_offload", "true");
>> +        } else {
>> +            smap_add(args, "tx_tso_offload", "false");
>> +        }
>>           smap_add(args, "lsc_interrupt_mode",
>>                    dev->lsc_interrupt_mode ? "true" : "false");
>>       }
>> @@ -2372,6 +2400,15 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
>>       for (i = 0; i < pkt_cnt; i++) {
>>           pkt = pkts[i];
>>   
>> +        /* Drop TSO packet if there's no TSO support on egress port */
>> +        if ((pkt->ol_flags & PKT_TX_TCP_SEG) &&
>> +            !(dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD)) {
>> +            VLOG_WARN_RL(&rl, "%s: TSO is disabled on port, TSO packet dropped"
>> +                        "%" PRIu32 " ", dev->up.name, pkt->pkt_len);
>> +            rte_pktmbuf_free(pkt);
>> +            continue;
>> +        }
> Hmmm, have we changed the nature of the function now. where as before it 
> was just checking packet length now we inspect for offload type as well.
> 
> It could be better to rename the function 'netdev_dpdk_filter_packet' 
> and add an explanation in the comment preceding the function WRT what 
> filters are applied (length and TSO offload support).

Good point. I've renamed the function to the above and added a function
comment to make it clear.

> 
> Ian
>> +
>>           if (OVS_UNLIKELY(pkt->pkt_len > dev->max_packet_len)) {
>>               if (!(pkt->ol_flags & PKT_TX_TCP_SEG)) {
>>                   VLOG_WARN_RL(&rl, "%s: Too big size %" PRIu32 " "
>> @@ -4245,6 +4282,14 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
>>           dev->tx_q[0].map = 0;
>>       }
>>   
>> +    if (dpdk_multi_segment_mbufs) {
>> +        dev->hw_ol_features |= NETDEV_TX_TSO_OFFLOAD;
>> +
>> +        VLOG_DBG("%s: TSO enabled on vhost port", dev->up.name);
>> +    } else {
>> +        dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
>> +    }
>> +
>>       netdev_dpdk_remap_txqs(dev);
>>   
>>       err = netdev_dpdk_mempool_configure(dev);
>>
>
diff mbox series

Patch

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index 082438e..a20deb8 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -39,6 +39,7 @@  DOC_SOURCE = \
 	Documentation/topics/dpdk/index.rst \
 	Documentation/topics/dpdk/bridge.rst \
 	Documentation/topics/dpdk/jumbo-frames.rst \
+	Documentation/topics/dpdk/tso.rst \
 	Documentation/topics/dpdk/memory.rst \
 	Documentation/topics/dpdk/pdump.rst \
 	Documentation/topics/dpdk/phy.rst \
diff --git a/Documentation/topics/dpdk/index.rst b/Documentation/topics/dpdk/index.rst
index cf24a7b..eb2a04d 100644
--- a/Documentation/topics/dpdk/index.rst
+++ b/Documentation/topics/dpdk/index.rst
@@ -40,4 +40,5 @@  The DPDK Datapath
    /topics/dpdk/qos
    /topics/dpdk/pdump
    /topics/dpdk/jumbo-frames
+   /topics/dpdk/tso
    /topics/dpdk/memory
diff --git a/Documentation/topics/dpdk/tso.rst b/Documentation/topics/dpdk/tso.rst
new file mode 100644
index 0000000..503354f
--- /dev/null
+++ b/Documentation/topics/dpdk/tso.rst
@@ -0,0 +1,111 @@ 
+..
+      Copyright 2018, Red Hat, Inc.
+
+      Licensed under the Apache License, Version 2.0 (the "License"); you may
+      not use this file except in compliance with the License. You may obtain
+      a copy of the License at
+
+          http://www.apache.org/licenses/LICENSE-2.0
+
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+      License for the specific language governing permissions and limitations
+      under the License.
+
+      Convention for heading levels in Open vSwitch documentation:
+
+      =======  Heading 0 (reserved for the title in a document)
+      -------  Heading 1
+      ~~~~~~~  Heading 2
+      +++++++  Heading 3
+      '''''''  Heading 4
+
+      Avoid deeper levels because they do not render well.
+
+===
+TSO
+===
+
+.. versionadded:: 2.11.0
+
+TCP Segmentation Offload (TSO) is a mechanism which allows a TCP/IP stack to
+offload the TCP segmentation into hardware, thus saving the cycles that would
+be required to perform this same segmentation in Software.
+
+TCP Segmentation Offload (TSO) enables a network stack to delegate segmentation
+of an oversized TCP segment to the underlying physical NIC. Offload of frame
+segmentation achieves computational savings in the core, freeing up CPU cycles
+for more useful work.
+
+A common use case for TSO is when using virtualization, where traffic that's
+coming in from a VM can offload the TCP segmentation, thus avoiding the
+fragmentation in Software. Additionally, if the traffic is headed to a VM
+within the same host further optimization can be expected. As the traffic never
+leaves the machine, no MTU needs to be accounted for, and thus no segmentation
+and checksum calculations are required, which saves yet more cycles. Only when
+the traffic actually leaves the host the segmentation needs to happen, in which
+case it will be performed by the egress NIC.
+
+When using TSO with DPDK, the implementation relies on the multi-segment mbufs
+feature, described in :doc:`/topics/dpdk/jumbo-frames`, where each mbuf
+contains ~2KiB of the entire packet's data and is linked to the next mbuf that
+contains the next portion of data.
+
+Enabling TSO
+~~~~~~~~~~~~
+Once multi-segment mbufs is enabled, TSO will be enabled by default, if there's
+support for it in the underlying physical NICs attached to OvS-DPDK.
+
+When using :doc:`vHost User ports <vhost-user>`, TSO may be enabled in one of
+two ways, as follows.
+
+`TSO` is enabled in OvS by the DPDK vHost User backend; when a new guest
+connection is established, `TSO` is thus advertised to the guest as an
+available feature:
+
+  1. QEMU Command Line Parameter:
+     ```
+     sudo $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
+     ...
+     -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,\
+     mrg_rxbuf=on,csum=on,gso=on,guest_csum=on,guest_tso4=on,\
+     guest_tso6=on,guest_ecn=on \
+     ...
+     ```
+
+  2. ethtool
+
+Assuming that the guest's OS also supports `TSO`, ethtool can be used to enable
+same:
+
+    ```
+    ethtool -K eth0 sg on     # scatter-gather is a prerequisite for TSO
+    ethtool -K eth0 tso on
+    ethtool -k eth0
+    ```
+
+    <b>Note:</b> In both methods, `mergeable buffers` are required:
+    ```
+    sudo $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
+    ...
+    mrg_rxbuf=on,\
+    ...
+    ```
+
+To enable TSO in a guest, the underlying NIC must first support `TSO` - consult
+your controller's datasheet for compatibility. Secondly, the NIC must have an
+associated DPDK Poll Mode Driver (PMD) which supports `TSO`.
+
+~~~~~~~~~~~
+Limitations
+~~~~~~~~~~~
+The current OvS `TSO` implementation supports flat and VLAN networks only (i.e.
+no support for `TSO` over tunneled connection [VxLAN, GRE, IPinIP, etc.]).
+
+Also, as TSO is built on top of multi-segments mbufs, the constraints pointed
+out in :doc:`/topics/dpdk/jumbo-frames` also apply for TSO. Thus, some
+performance hits might be noticed when running specific functionalitry, like
+the Userspace Connection tracker. And as mentioned in the same section, it is
+paramount that a packet's headers is contained within the first mbuf (~2KiB in
+size).
diff --git a/NEWS b/NEWS
index 98f5a9b..c7cfd17 100644
--- a/NEWS
+++ b/NEWS
@@ -23,6 +23,7 @@  Post-v2.10.0
      * Add option for simple round-robin based Rxq to PMD assignment.
        It can be set with pmd-rxq-assign.
      * Add support for DPDK 18.11
+     * Add support for TSO (between DPDK interfaces only).
    - Add 'symmetric_l3' hash function.
    - OVS now honors 'updelay' and 'downdelay' for bonds with LACP configured.
    - ovs-vswitchd:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ad7223a..7af37ee 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -374,7 +374,8 @@  struct ingress_policer {
 enum dpdk_hw_ol_features {
     NETDEV_RX_CHECKSUM_OFFLOAD = 1 << 0,
     NETDEV_RX_HW_CRC_STRIP = 1 << 1,
-    NETDEV_RX_HW_SCATTER = 1 << 2
+    NETDEV_RX_HW_SCATTER = 1 << 2,
+    NETDEV_TX_TSO_OFFLOAD = 1 << 3,
 };
 
 /*
@@ -1005,19 +1006,26 @@  dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq)
     /* Multi-segment-mbuf-specific setup. */
     if (dpdk_multi_segment_mbufs) {
         if (info.tx_offload_capa & DEV_TX_OFFLOAD_MULTI_SEGS) {
-            /* Enable multi-seg mbufs. DPDK PMDs typically attempt to use
-             * simple or vectorized transmit functions, neither of which are
-             * compatible with multi-segment mbufs. */
+            /* DPDK PMDs typically attempt to use simple or vectorized
+             * transmit functions, neither of which are compatible with
+             * multi-segment mbufs. Ensure that these are disabled when
+             * multi-segment mbufs are enabled.
+             */
             conf.txmode.offloads |= DEV_TX_OFFLOAD_MULTI_SEGS;
-        } else {
-            VLOG_WARN("Interface %s doesn't support multi-segment mbufs",
-                      dev->up.name);
-            conf.txmode.offloads &= ~DEV_TX_OFFLOAD_MULTI_SEGS;
+        }
+
+        if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
+            conf.txmode.offloads |= DEV_TX_OFFLOAD_TCP_TSO;
+            conf.txmode.offloads |= DEV_TX_OFFLOAD_TCP_CKSUM;
+            conf.txmode.offloads |= DEV_TX_OFFLOAD_IPV4_CKSUM;
         }
 
         txconf = info.default_txconf;
-        //txconf.txq_flags = ETH_TXQ_FLAGS_IGNORE;
         txconf.offloads = conf.txmode.offloads;
+    } else if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
+        dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
+        VLOG_WARN("Failed to set Tx TSO offload in %s. Requires option "
+                  "`dpdk-multi-seg-mbufs` to be enabled.", dev->up.name);
     }
 
     conf.intr_conf.lsc = dev->lsc_interrupt_mode;
@@ -1134,6 +1142,9 @@  dpdk_eth_dev_init(struct netdev_dpdk *dev)
     uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
                                      DEV_RX_OFFLOAD_TCP_CKSUM |
                                      DEV_RX_OFFLOAD_IPV4_CKSUM;
+    uint32_t tx_tso_offload_capa = DEV_TX_OFFLOAD_TCP_TSO |
+                                   DEV_TX_OFFLOAD_TCP_CKSUM |
+                                   DEV_TX_OFFLOAD_IPV4_CKSUM;
 
     rte_eth_dev_info_get(dev->port_id, &info);
 
@@ -1160,6 +1171,18 @@  dpdk_eth_dev_init(struct netdev_dpdk *dev)
         dev->hw_ol_features &= ~NETDEV_RX_HW_SCATTER;
     }
 
+    if (dpdk_multi_segment_mbufs) {
+        if (info.tx_offload_capa & tx_tso_offload_capa) {
+            dev->hw_ol_features |= NETDEV_TX_TSO_OFFLOAD;
+        } else {
+            dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
+            VLOG_WARN("Tx TSO offload is not supported on port "
+                      DPDK_PORT_ID_FMT, dev->port_id);
+        }
+    } else {
+        dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
+    }
+
     n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
     n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
 
@@ -1684,6 +1707,11 @@  netdev_dpdk_get_config(const struct netdev *netdev, struct smap *args)
         } else {
             smap_add(args, "rx_csum_offload", "false");
         }
+        if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
+            smap_add(args, "tx_tso_offload", "true");
+        } else {
+            smap_add(args, "tx_tso_offload", "false");
+        }
         smap_add(args, "lsc_interrupt_mode",
                  dev->lsc_interrupt_mode ? "true" : "false");
     }
@@ -2372,6 +2400,15 @@  netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
     for (i = 0; i < pkt_cnt; i++) {
         pkt = pkts[i];
 
+        /* Drop TSO packet if there's no TSO support on egress port */
+        if ((pkt->ol_flags & PKT_TX_TCP_SEG) &&
+            !(dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD)) {
+            VLOG_WARN_RL(&rl, "%s: TSO is disabled on port, TSO packet dropped"
+                        "%" PRIu32 " ", dev->up.name, pkt->pkt_len);
+            rte_pktmbuf_free(pkt);
+            continue;
+        }
+
         if (OVS_UNLIKELY(pkt->pkt_len > dev->max_packet_len)) {
             if (!(pkt->ol_flags & PKT_TX_TCP_SEG)) {
                 VLOG_WARN_RL(&rl, "%s: Too big size %" PRIu32 " "
@@ -4245,6 +4282,14 @@  dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
         dev->tx_q[0].map = 0;
     }
 
+    if (dpdk_multi_segment_mbufs) {
+        dev->hw_ol_features |= NETDEV_TX_TSO_OFFLOAD;
+
+        VLOG_DBG("%s: TSO enabled on vhost port", dev->up.name);
+    } else {
+        dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
+    }
+
     netdev_dpdk_remap_txqs(dev);
 
     err = netdev_dpdk_mempool_configure(dev);