[ovs-dev,v16,00/14] Support multi-segment mbufs

Message ID 20190911133003.720-1-michalx.obrembski@intel.com

Michal Obrembski Sept. 11, 2019, 1:29 p.m. UTC
Overview
========
This patchset introduces support for multi-segment mbufs to OvS-DPDK.
Multi-segment mbufs are typically used when the size of an mbuf is
insufficient to contain the entirety of a packet's data. Instead, the
data is split across numerous mbufs, each carrying a portion, or
'segment', of the packet data. Mbufs are chained via their 'next'
attribute (an mbuf pointer).
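
As a rough illustration of the chaining described above, the sketch below models a segment chain in Python. It is a toy model, not the DPDK API: the `Mbuf` class and the `nb_segs()` and `pkt_len()` helpers are hypothetical names that only borrow the vocabulary of the real mbuf fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mbuf:
    """Toy stand-in for a DPDK mbuf; not the real struct rte_mbuf."""
    data: bytes                      # bytes held by this segment
    next: Optional["Mbuf"] = None    # following segment, or None at the tail

    @property
    def data_len(self) -> int:
        """Number of bytes stored in this segment alone."""
        return len(self.data)


def nb_segs(head: Mbuf) -> int:
    """Count the segments reachable from the head of the chain."""
    count = 0
    seg = head
    while seg is not None:
        count += 1
        seg = seg.next
    return count


def pkt_len(head: Mbuf) -> int:
    """Total packet length: the sum of data_len over every segment."""
    total = 0
    seg = head
    while seg is not None:
        total += seg.data_len
        seg = seg.next
    return total


# A 5000-byte packet split over three segments of at most 2048 bytes each.
chain = Mbuf(b"a" * 2048, Mbuf(b"b" * 2048, Mbuf(b"c" * 904)))
```

Here `nb_segs(chain)` is 3 and `pkt_len(chain)` is 5000, while the head's own `data_len` is only 2048. Code that looks at the first segment alone therefore sees less than half of the packet.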

The main motivation behind the support for multi-segment mbufs is to
later introduce TSO (use case i. below) / GRO in OvS-DPDK, which is
planned to be introduced after this series.

Use Cases
=========
i.  Handling oversized (guest-originated) frames, which are marked
    for hardware acceleration/offload (TSO, for example).

    Packets which originate from a non-DPDK source may be marked for
    offload; as such, they may be larger than the permitted ingress
    interface's MTU, and may be stored in an oversized dp-packet. In
    order to transmit such packets over a DPDK port, their contents
    must be copied to a DPDK mbuf (via dpdk_do_tx_copy). However, in
    its current implementation, that function only copies data into
    a single mbuf; if the space available in the mbuf is exhausted,
    but not all packet data has been copied, then it is lost.
    Similarly, when cloning a DPDK mbuf, it must be considered
    whether that mbuf contains multiple segments. Both issues are
    resolved within this patchset.

ii. Handling jumbo frames.

    While OvS already supports jumbo frames, it does so by increasing
    mbuf size, such that the entirety of a jumbo frame may be handled
    in a single mbuf. This is certainly the preferred, and most
    performant approach (and remains the default).
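
The data loss described in use case i. can be sketched as follows. This is an illustrative model under stated assumptions, not the code in dpdk_do_tx_copy(): segments are plain Python byte strings, and `copy_single_segment()` and `copy_multi_segment()` are hypothetical helper names.

```python
def copy_single_segment(payload: bytes, seg_size: int) -> list:
    """Copy into one segment only; anything past seg_size is dropped."""
    return [payload[:seg_size]]


def copy_multi_segment(payload: bytes, seg_size: int) -> list:
    """Copy into as many seg_size segments as the payload needs."""
    if seg_size <= 0:
        raise ValueError("seg_size must be positive")
    return [payload[off:off + seg_size]
            for off in range(0, len(payload), seg_size)]


payload = bytes(range(256)) * 20          # 5120 bytes, larger than one segment
single = copy_single_segment(payload, 2048)
multi = copy_multi_segment(payload, 2048)
```

With a 2048-byte segment size, `single` keeps only the first 2048 bytes, whereas `multi` holds three segments whose concatenation equals the original payload.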

Enabling multi-segment mbufs
============================
Multi-segment and single-segment mbufs are mutually exclusive, and the
user must decide on which approach to adopt on init. The introduction
of a new OVSDB field, 'dpdk-multi-seg-mbufs', facilitates this.

This is a global boolean value, which determines how jumbo frames are
represented across all DPDK ports. In the absence of a user-supplied
value, 'dpdk-multi-seg-mbufs' defaults to false, i.e. multi-segment
mbufs must be explicitly enabled / single-segment mbufs remain the
default.

Setting the field is identical to setting existing DPDK-specific OVSDB
fields:

    ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
    ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x10
    ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem=4096,0
==> ovs-vsctl set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true

Performance notes (based on v8, 1st non-RFC)
============================================
In order to test for regressions in performance, tests were run on top
of master 88125d6 and v8 of this patchset, both with the multi-segment
mbufs option enabled and disabled.

VSperf was used to run the phy2phy_cont and pvp_cont tests with varying
packet sizes of 64B, 1500B and 7000B, on a 10Gbps interface.

Test | Size | Master | Multi-seg disabled | Multi-seg enabled
-------------------------------------------------------------
p2p  |  64  | ~22.7  |      ~22.65        |       ~18.3
p2p  | 1500 |  ~1.6  |        ~1.6        |        ~1.6
p2p  | 7000 | ~0.36  |       ~0.36        |       ~0.36
pvp  |  64  |  ~6.7  |        ~6.7        |        ~6.3
pvp  | 1500 |  ~1.6  |        ~1.6        |        ~1.6
pvp  | 7000 | ~0.36  |       ~0.36        |       ~0.36

Packet size is in bytes, while all packet rates are reported in Mpps
(aggregated).

No noticeable regression has been observed (everything is within a
± 5% margin of existing performance), aside from the 64B packet size
case when multi-segment mbufs are enabled. This is expected, however,
because vectorized Tx functions are incompatible with multi-segment
mbufs on some PMDs. The PMD in use during these tests was the i40e
(on an Intel X710 NIC), which indeed doesn't support vectorized Tx
functions with multi-segment mbufs.
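
For reference, the relative change in the two 64B rows can be worked out from the table. The snippet below is only arithmetic on the approximate ("~") figures above, so the percentages are approximate too; `relative_change()` is a helper name introduced here for illustration.

```python
def relative_change(baseline: float, measured: float) -> float:
    """Percentage change of measured relative to baseline."""
    return (measured - baseline) / baseline * 100.0


# 64B rows: master versus multi-seg enabled, both in Mpps.
p2p_64 = relative_change(22.7, 18.3)
pvp_64 = relative_change(6.7, 6.3)

print(f"p2p 64B: {p2p_64:.1f}%")
print(f"pvp 64B: {pvp_64:.1f}%")
```

This gives roughly -19.4% for p2p and -6.0% for pvp at 64B, i.e. the two cases that fall outside the 5% margin.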

This series is mostly a rebase onto current master of work started by
Tiago Lam in https://patchwork.ozlabs.org/cover/1023974/.

---
v16: - Fixed a build problem when the tree is checked out at patch
       "dp-packet: copy data from multi-seg. DPDK mbuf".

v15: - Rebase on master e64c2c1 ("rhel: Fix ovs-kmod-manage.sh 
       to work with RHEL 7.3");
     - Fixed unit tests regression introduced in previous patch;
     - Fixed compilation on clang.

v14: - Rebase on master adb3f0b ("python: Avoid flake8 warning
       for unused variables.");
     - Reorder changes to dp_packet_l2_5/l3/l4() from patch 06/11 to patch
       03/11 (Ian Stokes);
     - Fix non-null warning in GCC 8.2.1 (06/11, David Merchant);
     - Linearize packet in process_one() only, when entering Userspace
       Conntrack (06/11, Darrell Ball);
     - Change dp_packet_linearize() logic to be aware if a packet is linear
       and bail out early if so (06/11, Darrell Ball);
     - Add function header comments to new functions introduced in
       lib/packets.c (06/11, Darrell Ball);
     - Fix leak in dp_packet_linearize() when calling rte_pktmbuf_read(), if
       an error occurs (06/11, Ilya Maximets);
     - Return miniflow_extract() error in flow_extract() and check error in
       callers, appropriately (06/11, Ilya Maximets).

v13: - Fix patch 05/11, which was missing the copy of the mbufs flags in
       dp_packet_copy_mbuf_flags();
     - Re-order changes in dp_packet_copy_mbuf_flags() back to patch 05/11,
       instead of being done later in patch 06/11.

v12: - Rebase on master 46df7fa ("netdev-tc-offloads: Support IPv6 hlimit
       rewrite");
     - Previous patchset v11 took the approach of modifying the
       dp_packet_l2_5/l3/l4() functions so the size of the header to be fetched
       would be passed by the caller. This, however, would mean that many places
       throughout the code base would need to be modified. Instead, v12 takes
       the approach of, when multi-segments is enabled, verifying that incoming
       packets have their respective headers in the first mbuf. This is done in
       miniflow_extract(), which now may return an error;
     - Now that we have moved to DPDK 18.11, where more devices are reporting
       their offload capabilities, check for the DEV_TX_OFFLOAD_MULTI_SEGS
       offload capability before setting it;
     - Add comment to dp_packet_set_size() in dp-packet.h, to clarify its
       behaviour (Ian Stokes);
     - Fix coding style in several comments (Ian Stokes).
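
The v12 idea of requiring headers to sit in the first mbuf can be illustrated with a small model. This is a sketch, not the miniflow_extract() logic: segments are represented only by their lengths, `headers_in_first_segment()` and `extract()` are hypothetical names, and the header sizes are the minimum Ethernet, IPv4 and TCP sizes without VLAN tags or options.

```python
ETH_HDR_LEN = 14        # Ethernet header without VLAN tags
IPV4_MIN_HDR_LEN = 20   # IPv4 header without options
TCP_MIN_HDR_LEN = 20    # TCP header without options


def headers_in_first_segment(segment_lengths, header_len):
    """True if the first header_len bytes all sit in the first segment."""
    return bool(segment_lengths) and header_len <= segment_lengths[0]


def extract(segment_lengths):
    """Hypothetical stand-in for an extraction step that may fail.

    Returns 0 on success and -1 when the headers straddle segments, which
    mirrors the idea of miniflow_extract() returning an error.
    """
    needed = ETH_HDR_LEN + IPV4_MIN_HDR_LEN + TCP_MIN_HDR_LEN
    return 0 if headers_in_first_segment(segment_lengths, needed) else -1
```

For example, `extract([2048, 952])` succeeds, while `extract([40, 2048])` reports an error because the 54 header bytes do not fit in a 40-byte first segment.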

v11: - Rebase on master 35fe9ef ("dpif-netdev: Add vlan to mask for flow_put
       operation.");
     - Address Flavio's comments:
       - Remove unneeded RTE_PKTMBUF_HEADROOM used to extend an mbuf's data
         room when calling rte_pktmbuf_pool_create();
       - Remove MIN() condition in __packet_set_data() as the condition could
         never go above the packet's size;
       - Move dp_packet_copy_mbuf_flags() to the header file and fix a leak
         when calling dp_packet_clone_with_headroom();
       - Remove condition from dp_packet_linearize(). Callers must ensure the
         packet is non-linear before calling this function;
       - Add new dp_packet_read_data() which enables callers to get a specific
         portion of a dp_packet's data, copied only as last resort;
     - Address Ilya's comments:
       - Fix failing STP tests;
       - Improve dp_packet_equal() to consider packets identical in data but
         different in mbuf layout;
     - New packet_csum() and packet_crc32c() now provide a new way to calculate
       the checksum and crc32 of a dp_packet, by taking advantage of the newly
       introduced dp_packet_read_data();
     - Move the code base, where appropriate, to use the new packet_csum() and
       packet_crc32c() APIs.
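
The "copied only as last resort" behaviour can be sketched as below. This is a toy model under assumptions, not the dp_packet_read_data() or packet_csum() implementation: segments are Python byte strings, `read_data()` is a hypothetical helper, and the checksum is assumed to be an RFC 1071 style 16-bit ones'-complement sum.

```python
def read_data(segments, offset, size):
    """Return (data, copied) for size bytes starting at offset.

    A range inside one segment is returned as a zero-copy memoryview
    slice; only a range that crosses a segment boundary is copied.
    """
    if offset < 0 or size < 0:
        raise ValueError("offset and size must be non-negative")
    start = offset
    for seg in segments:
        if start < len(seg):
            if start + size <= len(seg):
                return memoryview(seg)[start:start + size], False
            break                     # range crosses into the next segment
        start -= len(seg)
    out = bytearray()                 # last resort: gather into a copy
    remaining, pos = size, offset
    for seg in segments:
        if remaining == 0:
            break
        if pos >= len(seg):
            pos -= len(seg)
            continue
        chunk = seg[pos:pos + remaining]
        out += chunk
        remaining -= len(chunk)
        pos = 0
    if remaining:
        raise ValueError("range extends past the end of the packet")
    return bytes(out), True


def internet_checksum(data) -> int:
    """16-bit ones'-complement checksum over data (RFC 1071 style)."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"               # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


segments = [b"\x45\x00\x00", b"\x1c\x00\x01"]
inside, inside_copied = read_data(segments, 0, 2)
across, across_copied = read_data(segments, 0, 6)
```

Reading two bytes from the first segment needs no copy, while reading all six bytes crosses the boundary and is gathered. The checksum over the gathered bytes matches the checksum over the same data held contiguously, which is the property a segment-aware checksum has to preserve.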

v10: - Rebase on master 0d5450a ("ovsdb-client: Fix a bug that uses wrong
       index");
     - Address Ilya's comments:
       - Fix dp_packet_reset() to not trim the packet;
       - Remove unused netdev_dpdk_is_multi_segment_mbufs_enabled() function;
       - Modify the dp_packet_l[2_5|3|4] layer functions to check that enough
         data is present in the packet before returning. The callers were also
         modified to act accordingly.
     - Add comment to dp_packet_set_size() to clarify its usage and modify
       dp_packet_pull() to comply with that;
     - Slightly modified the "linearization" approach. Instead of making it an
       implicit operation there are now two functions, dp_packet_is_linear() and
       dp_packet_linearize(), that enable the callers to explicitly check if a
       packet needs linearization and linearize it if needed;
     - Briefly mention the Userspace Conntrack as one of the "limitations" when
       using multi-segment mbufs.
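
The explicit check-then-linearize pattern can be sketched with a toy model. The functions below are hypothetical Python stand-ins that only borrow the names of dp_packet_is_linear() and dp_packet_linearize(); a packet is modelled as a list of byte-string segments.

```python
def is_linear(segments) -> bool:
    """A packet is linear when all of its data sits in at most one segment."""
    return len(segments) <= 1


def linearize(segments):
    """Return a single-segment packet holding the same bytes.

    Already-linear packets are returned unchanged, so no copy is made
    unless the data is actually scattered.
    """
    if is_linear(segments):
        return segments
    return [b"".join(segments)]


scattered = [b"head", b"-", b"tail"]
linear = linearize(scattered)
```

A caller that needs contiguous data can test `is_linear()` first and pay for the copy in `linearize()` only when the packet is really segmented.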

v9: - Rebase on master e4e2009 ("tunnel, tests: Sort flow output in ERSPAN
      v1/v2 metadata");
    - Simplify patch 09/14. The functions introduced in packets.c were dropped
      so the code in netdev-native-tnl.c remains largely the same. These can be
      introduced at a later time, if needed (maybe when csum across segmented
      data is introduced).

v8: - Rebase on master 1dd218a ("ovsdb-idl: Fix recently introduced Python 3
      tests.");
    - Address Ian's comment:
      - Fix sparse warnings on patch 07/14 and 12/14 by allocating memory
        dynamically.
    - Address Ilya's comments:
      - netdev_linux_tap_batch_send() and udp_extract_tnl_md() now linearize
        the data before write()'ing it or computing checksums over it;
      - Some other cases have been found and adapted; the new patch 09/14
        introduced in the series is where the "linearization" logic is
        introduced and, as a consequence, some users of the dp_packet API,
        which were assuming the data is held contiguously in memory, are
        changed to use the new APIs.
    - Add support for multi-segment mbufs to dp_packet_equal() (patch 06/14);
    - Fix a bug in patch 08/14 where the call to dp_packet_copy_mbuf_flags() in
      dp_packet_clone_with_headroom() was incorrectly setting the nb_segs field
      on the destination mbuf;
    - Add unit-tests for dp_packet_equal() and (new) dp_packet_linear_data() to
      patch 12/14;
    - Add a comment to jumbo-frames.rst under topics/dpdk/ to warn how
      multi-segment mbufs may affect performance when using large packets
      across DPDK and non-DPDK ports.

v7: - Rebase on master 024810c ("Prepare for post-2.10.0 (2.10.90).");
    - Add Ben's proposed fix for automake's warning;
    - Add a note to cover letter to explain this is preparatory work for TSO /
      GRO.

v6: - Rebase on master d1b235d ("tests: Add test for ovn-nbctl's command parser
      error paths.");
    - Address Darrell's comments:
      - The changes in dp_packet_resize__() were trying to alleviate the call
        to OVS_NOT_REACHED() for DPDK packets, by trying to reuse the available
        tailroom space when no more headroom space is available, and vice-versa.
        However, this was breaking the API for the dp_packet_resize__()
        function (only called in dp_packet_prealloc_tailroom() and
        dp_packet_prealloc_headroom()), which doesn't seem to suit the purposes
        for DPDK packets.
        Instead, and because this is isolated functionality, revert to the
        previous state where dp_packet_resize__() is not supported for DPDK
        packets. Hence, patch 08/14 has been dropped.
    - Additionally, fix the tests that were relying on the removed
      functionality.

v5: - Rebase on master 030958a0cc ("conntrack: Fix conn_update_state_alg use
      after free.");
    - Address Eelco's comments:
      - Remove dpdk_mp_sweep() call in netdev_dpdk_mempool_configure(), a
        leftover from rebase. Only call should be in dpdk_mp_get();
      - Remove NEWS line added by mistake during rebase (about adding
        experimental vhost zero copy support).
    - Address Ian's comments:
      - Drop patch 01 from previous series entirely;
      - Patch (now) 01/14 adds a new call to dpdk_buf_size() inside
        dpdk_mp_create() to get the correct "mbuf_size" to be used;
      - Patch (now) 11/14 modifies dpdk_mp_create() to check if multi-segment
        mbufs is enabled, in which case it calculates the new "mbuf_size" to be
        used;
      - In free_dpdk_buf() and dpdk_buf_alloc(), don't lock and unlock
        conditionally.
    - Add "per-port-memory=true" to test "Multi-segment mbufs Tx" as the current
      DPDK set up in system-dpdk-testsuite can't handle higher MTU sizes using
      the shared mempool model (runs out of memory);
    - Add new examples for when multi-segment mbufs are enabled in
      topics/dpdk/memory.rst, and a reference to topics/dpdk/jumbo-frames.rst
      (now patch 11/14).

v4: - Rebase on master b22fb00 ("ovn-nbctl: Clarify error messages in qos-add
      command."):
      - A new patch (now 01/15) has been introduced to differentiate between
        MTU and mbuf size when creating the mempool. This is because as part of
        the new support for both per port and shared mempools, mempools were
        being reused based on the mbuf size, and for multi-segment mbufs the
        mbuf size can end up being the same for all mbufs;
      - A couple of other patches (now 02/15 and 12/15) have been modified as
        part of the rebase, but only to adapt to the code changes to "Support
        both shared and per port mempools.", no functionality should have been
        changed.

v3:
    - Address Eelco's comment:
      - Fix the ovs_assert() introduced in v2 in __packet_set_data(), which
        wasn't correctly asserting that the passed 'v' was smaller than the
        first mbuf's buf_len.

v2:
    - Rebase on master e7cd8cf ("db-ctl-base: Don't die in cmd_destroy() on
      error.");
    - Address Eelco's comments:
      - In free_dpdk_buf(), use mbuf's struct address in dp_packet instead of
        casting;
      - Remove unneeded variable in dp_packet_set_size(), pointing to the
        first mbuf in the chain;
      - Assert in dp_packet_set_size() to enforce that "pkt_len == v" is
        always true for DPBUF_DPDK packets;
      - Assert in __packet_set_data() to enforce that data_off never goes
        beyond the first mbuf in the chain.

v1:
    - v8 should have been sent as v1 really, as that's usually the approach
      followed in OvS. That clearly didn't happen, so restarting the series
      now. This also helps make it clear that it is no longer an RFC series;
    - Rebase on master e461481 ("ofp-meter: Fix ofp_print_meter_flags()
      output.");
    - Address Eelco's comments:
      - Change dp_packet_size() so that for `DPBUF_DPDK` packets their
        `pkt_len` and `data_len` can't be set to values bigger than the
        available space. Also fix assignment to `data_len` which was
        incorrectly being set to just `pkt_len`;
      - Improve `nonpmd_mp_mutex` comment with a better explanation as to
        why the mutex is needed;
      - Fix dp_packet_clear() to not call rte_pktmbuf_reset() for non
        `DPBUF_DPDK` packets;
      - Dropped `if` clause in dp_packet_l4_size(), keep just the `else`;
      - Change dp_packet_clone_with_headroom() to use rte_pktmbuf_read() for
        copying `DPBUF_DPDK` packets' data. Also, change it to return
        appropriate and meaningful errors, instead of just "0" or "1";
      - Rename dpdk_prep_tx_buf() to dpdk_clone_dp_packet_to_mbuf(),
        and reuse dp_packet_mbuf_write() instead of manual copy;
      - Add note to vswitchd/vswitch.xml to make it clear that enabling
        dpdk-multi-seg-mbufs requires a restart;
    - Change dpdk_mp_create() to increase # mbufs used under the multi-segment
      mbufs approach;
    - Increase test coverage by adding "end-to-end" tests that verify that
      "dpdk-multi-seg-mbufs" is disabled by default and that a packet is
      successfully sent out;
    - Some helper funcs such as dp_packet_tail() and dp_packet_end() were moved
      back to be common between `DPBUF_DPDK` and non `DPBUF_DPDK` packets, to
      minimise changes;
    - Add some extra notes to "Performance notes" in jumbo-frames.rst doc,
      after further testing;
    - Structure changes:
      - Drop patch 07/13 which is now unneeded;
      - Two more patches added for extra test coverage. This is what accounts
        for the increase in size (+1 patch) in the series.

v8 (non-RFC):
    - Rebase on master 88125d6 ("rhel: remove ovs-sim man page from
      temporary directory (also for RHEL)");
    - Address Ciara's and Ilya's comments:
      - Drop the dp_packet_mbuf_tail() function and use only the
        already existing dp_packet_tail();
      - Fix bug in dpdk_do_tx_copy() where the error return from
        dpdk_prep_tx_buf() was being wrongly checked;
      - Use dpdk_buf_alloc() and free_dpdk_buf() instead of
        rte_pktmbuf_alloc() and rte_pktmbuf_free();
      - Fix some other code style and duplication issues pointed out.
    - Refactored dp_packet_shift(), dp_packet_resize__() and
      dp_packet_put*() functions to work within the bounds of existing
      mbufs only;
    - Fix dp_packet_clear() which wasn't correctly clearing / freeing
      other mbufs in the chain for chains with more than a single mbuf;
    - dp_packet layer functions (such as dp_packet_l3()) now check if
      the header is within the first mbuf, when using mbufs;
    - Move patch 08/13 to before patch 04/13, since dp_packet_set_size()
      was refactored to use free_dpdk_buf();
    - Fix wrong rte_memcpy() when performing dp_packet_clone() which was
      leading to memory corruption;
    - Modified the added tests to account for some of the above changes;
    - Run performance tests, compiling results and adding them to the
      cover letter;
    - Add a multi-seg mbufs explanation to the jumbo-frames.rst doc,
      together with a "Performance notes" sub-section reflecting the
      findings mentioned above in the cover letter.

v7:  - Rebase on master 5e720da ("erspan: fix invalid erspan version.");
     - Address Ilya's comments:
       - Fix non-DPDK build;
       - Serialise the access of non pmds to allocation and free of mbufs by
         using a newly introduced mutex.
     - Add a new set of tests that integrates with the recently added DPDK
       testsuite. These focus on allocating dp_packets, with a single or
       multiple mbufs, from an instantiated mempool and performing several
       operations on those, verifying if the data at the end matches what's
       expected;
     - Fix bugs found by the new tests:
       - dp_packet_shift() wasn't taking left shifts into account;
       - dp_packet_resize__() was misusing and miscalculating the tailrooms
         and headrooms, ending up calculating the wrong number of segments
         that needed allocation;
       - An mbuf's end was being miscalculated in dp_packet_tail(),
         dp_packet_mbuf_tail() and dp_packet_end();
       - dp_packet_set_size() was not updating the number of chained segments
         'nb_segs';
     - Add support for multi-seg mbufs in dp_packet_clear().

v6:  - Rebase on master 7c0cb29 ("conntrack-tcp: Handle tcp session
       reuse.");
     - Further improve dp_packet_put_uninit() and dp_packet_shift() to
       support multi-seg mbufs;
     - Add support for multi-seg mbufs in dp_packet_l4_size() and
       improve other helper funcs, such as dp_packet_tail() and
       dp_packet_tailroom().
     - Add support for multi-seg mbufs in dp_packet_put(),
       dp_packet_put_zeros(), as well as dp_packet_resize__() - allocating
       new mbufs and linking them together;
     Restructured patchset:
     - Squash patch 5 into patch 6, since they were both related to
       copying data while handling multi-seg mbufs;
     - Split patch 4 into two separate patches - one that introduces the
       changes in helper functions to deal with multi-seg mbufs and
       two others for the shift() and put_uninit() functionality;
     - Move patch 4 to before patch 3, so that helper functions come
       before functionality improvements that rely on those helpers.

v5: - Rebased on master e5e22dc ("datapath-windows: Prevent ct-counters
      from getting redundantly incremented");
    - Sugesh's comments have been addressed:
      - Changed dp_packet_set_data() and dp_packet_set_size() logic to
        make them independent of each other;
      - Dropped patch 3 now that dp_packet_set_data() and
        dp_packet_set_size() are independent;
      - dp_packet_clone_with_headroom() now has split functions for
        handling DPDK sourced packets and non-DPDK packets;
    - Modified various functions in dp-packet.h to account for multi-seg
      mbufs - dp_packet_put_uninit(), dp_packet_tail(), dp_packet_tail()
      and dp_packet_at();
    - Added support for shifting packet data in multi-seg mbufs, using
      dp_packet_shift();
    - Fixed some minor inconsistencies.

    Note that some of the changes in v5 have been contributed by Mark
    Kavanagh as well.

v4: - restructure patchset
    - account for 128B ARM cacheline when sizing mbufs

Artur Twardowski (1):
  dp-packet: Fix invalid size of ICMPv6 header

Mark Kavanagh (2):
  netdev-dpdk: copy large packet to multi-seg. mbufs
  netdev-dpdk: support multi-segment jumbo frames.

Michael Qiu (1):
  dp-packet: copy data from multi-seg. DPDK mbuf

Michal Obrembski (3):
  dpdk-tests: Fix Multi-segment DPDK Unittests
  Fix build without DPDK
  Fix DPDK MBUF tests compilation on some compilers

Tiago Lam (8):
  netdev-dpdk: Serialise non-pmds mbufs' alloc/free.
  dp-packet: Fix data_len handling multi-seg mbufs.
  dp-packet: Handle multi-seg mbufs in helper funcs.
  dp-packet: Handle multi-seg mubfs in shift() func.
  dp-packet: Add support for data "linearization".
  dpdk-tests: Add unit-tests for multi-seg mbufs.
  dpdk-tests: Accept other configs in OVS_DPDK_START
  dpdk-tests: End-to-end tests for multi-seg mbufs.

 Documentation/topics/dpdk/jumbo-frames.rst |  73 +++
 Documentation/topics/dpdk/memory.rst       |  36 ++
 NEWS                                       |   1 +
 lib/conntrack.c                            |   5 +
 lib/crc32c.c                               |  17 +-
 lib/crc32c.h                               |   2 +
 lib/dp-packet.c                            | 168 ++++++-
 lib/dp-packet.h                            | 497 +++++++++++++++++---
 lib/dpdk.c                                 |   8 +
 lib/dpif-netdev.c                          |  18 +-
 lib/dpif-netlink.c                         |   3 +
 lib/dpif.c                                 |   6 +
 lib/flow.c                                 | 110 ++++-
 lib/flow.h                                 |   4 +-
 lib/mcast-snooping.c                       |   2 +
 lib/netdev-bsd.c                           |   3 +
 lib/netdev-dpdk.c                          | 189 +++++++-
 lib/netdev-dpdk.h                          |   1 +
 lib/netdev-dummy.c                         |   6 +
 lib/netdev-linux.c                         |   6 +
 lib/netdev-native-tnl.c                    |  26 +-
 lib/odp-execute.c                          |  24 +-
 lib/packets.c                              |  96 +++-
 lib/packets.h                              |   7 +
 ofproto/ofproto-dpif-upcall.c              |  21 +-
 ofproto/ofproto-dpif-xlate.c               |  27 +-
 tests/automake.mk                          |  10 +-
 tests/dpdk-packet-mbufs.at                 |   7 +
 tests/system-dpdk-macros.at                |   7 +-
 tests/system-dpdk-testsuite.at             |   1 +
 tests/system-dpdk.at                       |  67 +++
 tests/test-dpdk-mbufs.c                    | 722 +++++++++++++++++++++++++++++
 tests/test-rstp.c                          |   9 +-
 tests/test-stp.c                           |   9 +-
 vswitchd/vswitch.xml                       |  22 +
 35 files changed, 2064 insertions(+), 146 deletions(-)
 create mode 100644 tests/dpdk-packet-mbufs.at
 create mode 100644 tests/test-dpdk-mbufs.c

Comments

Flavio Leitner Sept. 23, 2019, 2:06 p.m. UTC | #1
Hi Michal,

First of all thank you for continuing with the TSO work.

I spent a bit of time reviewing the patchset and my impression is
that the multi-segment support is quite expensive. Even when TSO
is off, we still have a non-trivial CPU cost which we can't optimize
further. It also adds complexity to packet handling, which makes the
code harder to maintain in the long term.

Given that CPU is a bottleneck that can't be easily upgraded but
memory can be expanded, I thought we could look at an alternative
that uses more memory instead of relying on CPU. I hope that
this work helps to decide on and support the final solution.

Unlike Mark's (2016?) approach of changing the default mpool
to 64k, this creates one specific memory pool with an element size of
64k, enough to hold the maximum packet size. It would guarantee that
the packet is always linear, so we don't need to change any of the packet
manipulation functions in OvS. However, we do need to change the DPDK
API to accept one more memory pool.

Therefore, I created one extra mpool to hold 64k data packets without
local cache and handed that to rte_vhost_dequeue_burst(), which
decides, based on the virtio buffer length, whether to use the MTU
sized or 64k sized mpool.
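
To make the selection rule concrete, here is a toy sketch of the decision. It is not the code from the PoC patches: `pick_pool()` and its parameters are hypothetical, the pools are represented only by name, and the MTU-sized element size is passed in because the text does not state it.

```python
MAX_PKT_LEN = 64 * 1024   # element size of the extra pool, per the text


def pick_pool(buf_len: int, mtu_elt_size: int) -> str:
    """Name the pool a buffer of buf_len bytes would be drawn from."""
    if buf_len < 0:
        raise ValueError("buf_len must be non-negative")
    if buf_len <= mtu_elt_size:
        return "mtu"                  # fits in a regular MTU-sized element
    if buf_len <= MAX_PKT_LEN:
        return "64k"                  # oversized (e.g. TSO) packet
    raise ValueError("buffer exceeds the largest pool element")
```

With an assumed 2048-byte MTU element, `pick_pool(1500, 2048)` selects the regular pool and `pick_pool(60000, 2048)` selects the 64k pool, so every packet still lands in a single linear buffer.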

The PoC code is not complete as I am sure packet copies, for example,
need to consider the packet size to allocate the new buffer from the
right mpool, but it allows iperf3 to work from a VM using vhost-user
client to a server to show the potential results.

These are the results[*] using iperf3:
 = Baremetal to Baremetal, over 40G link
  [ ID] Interval           Transfer     Bandwidth       Retr
  [  4]   0.00-10.00  sec  40.3 GBytes  34.6 Gbits/sec   83 sender
  [  4]   0.00-10.00  sec  40.3 GBytes  34.6 Gbits/sec receiver

 = VM to baremetal using 2.12 without changes:
  [ ID] Interval           Transfer     Bitrate         Retr
  [  5]   0.00-10.00  sec  14.2 GBytes  12.2 Gbits/sec    0 sender
  [  5]   0.00-10.00  sec  14.2 GBytes  12.2 Gbits/sec receiver

 = VM to baremetal using 2.12 plus patches:
  [ ID] Interval           Transfer     Bitrate         Retr
  [  5]   0.00-10.00  sec  43.9 GBytes  37.7 Gbits/sec   81 sender
  [  5]   0.00-10.00  sec  43.9 GBytes  37.7 Gbits/sec receiver

As you can see, the VM is pushing at line rate speed.

I will post the patches as a reply to this email.
The code is also available on my github account:
  https://github.com/fleitner/ovs/tree/tso-enabled-vhost-v1
  https://github.com/fleitner/dpdk/tree/ovs2.12-vhost-tso-v1

[*] Those are quick results, just to have a confirmation. I would
need to run many more times and also with testpmd + trex to see pps
numbers with different packet sizes to get a real picture, of course.

Thanks,
fbl


On Wed, Sep 11, 2019 at 03:29:49PM +0200, Michal Obrembski wrote:
> 
> Overview
> ========
> This patchset introduces support for multi-segment mbufs to OvS-DPDK.
> Multi-segment mbufs are typically used when the size of an mbuf is
> insufficient to contain the entirety of a packet's data. Instead, the
> data is split across numerous mbufs, each carrying a portion, or
> 'segment', of the packet data. Mbufs are chained via their 'next'
> attribute (an mbuf pointer).
> 
> The main motivation behind the support for multi-segment mbufs is to
> later introduce TSO (use case i. below) / GRO in OvS-DPDK, which is
> planned to be introduced after this series.
> 
> Use Cases
> =========
> i.  Handling oversized (guest-originated) frames, which are marked
>     for hardware accelration/offload (TSO, for example).
> 
>     Packets which originate from a non-DPDK source may be marked for
>     offload; as such, they may be larger than the permitted ingress
>     interface's MTU, and may be stored in an oversized dp-packet. In
>     order to transmit such packets over a DPDK port, their contents
>     must be copied to a DPDK mbuf (via dpdk_do_tx_copy). However, in
>     its current implementation, that function only copies data into
>     a single mbuf; if the space available in the mbuf is exhausted,
>     but not all packet data has been copied, then it is lost.
>     Similarly, when cloning a DPDK mbuf, it must be considered
>     whether that mbuf contains multiple segments. Both issues are
>     resolved within this patchset.
> 
> ii. Handling jumbo frames.
> 
>     While OvS already supports jumbo frames, it does so by increasing
>     mbuf size, such that the entirety of a jumbo frame may be handled
>     in a single mbuf. This is certainly the preferred, and most
>     performant approach (and remains the default).
> 
> Enabling multi-segment mbufs
> ============================
> Multi-segment and single-segment mbufs are mutually exclusive, and the
> user must decide on which approach to adopt on init. The introduction
> of a new OVSDB field, 'dpdk-multi-seg-mbufs', facilitates this.
> 
> This is a global boolean value, which determines how jumbo frames are
> represented across all DPDK ports. In the absence of a user-supplied
> value, 'dpdk-multi-seg-mbufs' defaults to false, i.e. multi-segment
> mbufs must be explicitly enabled / single-segment mbufs remain the
> default.
> 
> Setting the field is identical to setting existing DPDK-specific OVSDB
> fields:
> 
>     ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
>     ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x10
>     ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem=4096,0
> ==> ovs-vsctl set Open_vSwitch . other_config:dpdk-multi-seg-mbufs=true
> 
> Performance notes (based on v8, 1st non-RFC)
> =================
> In order to test for regressions in performance, tests were run on top
> of master 88125d6 and v8 of this patchset, both with the multi-segment
> mbufs option enabled and disabled.
> 
> VSperf was used to run the phy2phy_cont and pvp_cont tests with varying
> packet sizes of 64B, 1500B and 7000B, on a 10Gbps interface.
> 
> Test | Size | Master | Multi-seg disabled | Multi-seg enabled
> -------------------------------------------------------------
> p2p  |  64  | ~22.7  |      ~22.65        |       ~18.3
> p2p  | 1500 |  ~1.6  |        ~1.6        |        ~1.6
> p2p  | 7000 | ~0.36  |       ~0.36        |       ~0.36
> pvp  |  64  |  ~6.7  |        ~6.7        |        ~6.3
> pvp  | 1500 |  ~1.6  |        ~1.6        |        ~1.6
> pvp  | 7000 | ~0.36  |       ~0.36        |       ~0.36
> 
> Packet size is in bytes, while all packet rates are reported in mpps
> (aggregated).
> 
> No noticeable regression has been observed (certainly everything is
> within the ± 5% margin of existing performance), aside from the 64B
> packet size case when multi-segment mbufs are enabled. This is
> expected, however, because vectorized Tx functions are
> incompatible with multi-segment mbufs on some PMDs. The PMD in
> use during these tests was the i40e (on an Intel X710 NIC), which
> indeed doesn't support vectorized Tx functions with multi-segment
> mbufs.
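
For reference, the margin claim above can be checked with a few lines of arithmetic. This is only a minimal sketch in C: rel_change_pct() and within_margin() are illustrative helper names, not part of the patchset, and the inputs are the rounded figures from the table, so the results are approximate.

```c
/* Relative change of 'measured' against 'baseline', in percent.
 * Negative values indicate a drop in packet rate. */
static double
rel_change_pct(double baseline, double measured)
{
    return (measured - baseline) / baseline * 100.0;
}

/* Returns 1 if 'measured' lies within +/- 'margin_pct' of 'baseline',
 * 0 otherwise. */
static int
within_margin(double baseline, double measured, double margin_pct)
{
    double delta = rel_change_pct(baseline, measured);

    return delta >= -margin_pct && delta <= margin_pct;
}
```

With the table's figures, the 64B rows with multi-seg enabled come out at roughly -19% for p2p (22.7 to 18.3) and roughly -6% for pvp (6.7 to 6.3), while the 64B p2p row with multi-seg disabled is about -0.2%.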
> 
> This is mostly a rebase onto current master of the work started by
> Tiago Lam in https://patchwork.ozlabs.org/cover/1023974/.
> 
> ---
> v16: - Fixed a build failure when checking out patch
>        "dp-packet: copy data from multi-seg. DPDK mbuf".
> 
> v15: - Rebase on master e64c2c1 ("rhel: Fix ovs-kmod-manage.sh 
>        to work with RHEL 7.3");
>      - Fixed unit tests regression introduced in previous patch;
>      - Fixed compilation on clang.
> 
> v14: - Rebase on master adb3f0b ("python: Avoid flake8 warning
>        for unused variables.");
>      - Reorder changes to dp_packet_l2_5/l3/l4() from patch 06/11 to patch
>        03/11 (Ian Stokes);
>      - Fix non-null warning in GCC 8.2.1 (06/11, David
>        Merchant);
>      - Linearize packet in process_one() only, when entering
>        Userspace Conntrack (06/11, Darrell Ball);
>      - Change dp_packet_linearize() logic to be aware if a
>        packet is linear and bail out early if so (06/11, Darrell
>        Ball);
>      - Add function header comments to new functions introduced
>        in lib/packets.c (06/11, Darrell Ball);
>      - Fix leakage in dp_packet_linearize() when calling rte_pktmbuf_read(),
>        if an error occurs (06/11, Ilya Maximets);
>      - Return miniflow_extract() error in flow_extract() and check error in
>        callers, appropriately (06/11, Ilya Maximets).
> 
> v13: - Fix patch 05/11, which was missing the copy of the mbufs flags in
>        dp_packet_copy_mbuf_flags();
>      - Re-order changes in dp_packet_copy_mbuf_flags() back to patch 05/11,
>        instead of being done later in patch 06/11.
> 
> v12: - Rebase on master 46df7fa ("netdev-tc-offloads: Support IPv6 hlimit
>        rewrite");
>      - Previous patchset v11 took the approach of modifying the
>        dp_packet_l2_5/l3/l4() functions so the size of the header to be fetched
>        would be passed by the caller. This, however, would mean that many places
>        throughout the code base would need to be modified. Instead, v12 takes
>        the approach of, when multi-segments is enabled, verifying that incoming
>        packets have their respective headers in the first mbuf. This is done in
>        miniflow_extract(), which now may return an error;
>      - Now that we have moved to DPDK 18.11, where more devices are reporting
>        their offload capabilities, check for the DEV_TX_OFFLOAD_MULTI_SEGS
>        offload capability before setting it;
>      - Add comment to dp_packet_set_size() in dp-packet.h, to clarify its
>        behaviour (Ian Stokes);
>      - Fix coding style in several comments (Ian Stokes).
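
The v12 approach of verifying that headers sit in the first mbuf boils down to a bounds check. The following is a rough sketch of that idea only, not the code in miniflow_extract(): hdr_in_first_seg() and its parameters are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Returns 1 if a header of 'hdr_len' bytes that starts 'hdr_ofs' bytes into
 * the packet lies entirely within a first segment holding 'first_seg_len'
 * bytes of data, 0 otherwise.
 *
 * The comparison is arranged so that 'hdr_ofs + hdr_len' is never computed,
 * which avoids size_t overflow on hostile offsets. */
static int
hdr_in_first_seg(size_t first_seg_len, size_t hdr_ofs, size_t hdr_len)
{
    return hdr_ofs <= first_seg_len && hdr_len <= first_seg_len - hdr_ofs;
}
```

A caller in this sketch would report an error for the packet when the check fails, rather than reading a header that straddles two segments.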
> 
> v11: - Rebase on master 35fe9ef ("dpif-netdev: Add vlan to mask for flow_put
>        operation.");
>      - Address Flavio's comments:
>        - Remove unneeded RTE_PKTMBUF_HEADROOM used to extend an mbuf's data
>          room when calling rte_pktmbuf_pool_create();
>        - Remove MIN() condition in __packet_set_data() as the condition could
>          never go above the packet's size;
>        - Move dp_packet_copy_mbuf_flags() to the header file and fix a leak
>          when calling dp_packet_clone_with_headroom();
>        - Remove condition from dp_packet_linearize(). Callers must ensure the
>          packet is non-linear before calling this function;
>        - Add new dp_packet_read_data() which enables callers to get a specific
>          portion of a dp_packet's data, copied only as last resort;
>      - Address Ilya's comments:
>        - Fix failing STP tests;
>        - Improve dp_packet_equal() to consider packets identical in data but
>          different in mbuf layout;
>      - New packet_csum() and packet_crc32c() now provide a new way to calculate
>        the checksum and crc32 of a dp_packet, by taking advantage of the newly
>        introduced dp_packet_read_data();
>      - Move the code base, where appropriate, to use the new packet_csum() and
>        packet_crc32c() APIs.
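
To illustrate what "identical in data but different in mbuf layout" means for the dp_packet_equal() change above, here is a small sketch over a toy segment chain. It assumes a hypothetical 'struct seg' standing in for a chained mbuf; it is not the rte_mbuf layout and not the implementation in this series.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one segment of a chained mbuf. */
struct seg {
    const unsigned char *data;  /* Start of this segment's payload. */
    size_t len;                 /* Bytes of payload in this segment. */
    const struct seg *next;     /* Next segment, or NULL for the last one. */
};

/* Returns 1 if chains 'a' and 'b' carry the same byte stream, regardless of
 * where the segment boundaries fall, 0 otherwise. */
static int
seg_chain_equal(const struct seg *a, const struct seg *b)
{
    size_t a_off = 0, b_off = 0;

    for (;;) {
        size_t n;

        /* Step past segments that are exhausted (or empty). */
        while (a && a_off == a->len) {
            a = a->next;
            a_off = 0;
        }
        while (b && b_off == b->len) {
            b = b->next;
            b_off = 0;
        }
        if (!a || !b) {
            /* Equal only if both chains ended at the same time. */
            return !a && !b;
        }

        /* Compare the largest run that is contiguous in both chains. */
        n = a->len - a_off;
        if (b->len - b_off < n) {
            n = b->len - b_off;
        }
        if (memcmp(a->data + a_off, b->data + b_off, n)) {
            return 0;
        }
        a_off += n;
        b_off += n;
    }
}
```

In this sketch a chain split as "ab" + "cdef" compares equal to one split as "abcd" + "ef", since only the concatenated data matters.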
> 
> v10: - Rebase on master 0d5450a ("ovsdb-client: Fix a bug that uses wrong
>        index");
>      - Address Ilya's comments:
>        - Fix dp_packet_reset() to not trim the packet;
>        - Remove unused netdev_dpdk_is_multi_segment_mbufs_enabled() function;
>        - Modify the dp_packet_l[2_5|3|4] layer functions to check that enough
>          data is present in the packet before returning. The callers were also
>          modified to act accordingly.
>      - Add comment to dp_packet_set_size() to clarify its usage and modify
>        dp_packet_pull() to comply with that;
>      - Slightly modified the "linearization" approach. Instead of making it an
>        implicit operation there are now two functions, dp_packet_is_linear() and
>        dp_packet_linearize(), that enable the callers to explicitly check if a
>        packet needs linearization and linearize it if needed;
>      - Briefly mention the Userspace Conntrack as one of the "limitations" when
>        using multi-segment mbufs.
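
The explicit check-then-linearize pattern described for v10 can be sketched on a toy segment chain. This is a minimal illustration under stated assumptions: 'struct seg', seg_is_linear() and seg_linearize() are hypothetical names, and unlike dp_packet_linearize() this sketch copies into a freshly allocated buffer instead of modifying the packet.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one segment of a chained mbuf. */
struct seg {
    const unsigned char *data;  /* Start of this segment's payload. */
    size_t len;                 /* Bytes of payload in this segment. */
    const struct seg *next;     /* Next segment, or NULL for the last one. */
};

/* A packet is linear when all of its data sits in a single segment. */
static int
seg_is_linear(const struct seg *head)
{
    return !head || !head->next;
}

/* Copies the whole chain into one contiguous, newly allocated buffer and
 * stores the total length in '*out_len'.  Returns NULL if allocation fails.
 * The caller owns the returned buffer and must free() it. */
static unsigned char *
seg_linearize(const struct seg *head, size_t *out_len)
{
    const struct seg *s;
    unsigned char *buf;
    size_t total = 0, off = 0;

    for (s = head; s; s = s->next) {
        total += s->len;
    }

    buf = malloc(total ? total : 1);
    if (!buf) {
        return NULL;
    }

    for (s = head; s; s = s->next) {
        if (s->len) {
            memcpy(buf + off, s->data, s->len);
            off += s->len;
        }
    }
    *out_len = total;
    return buf;
}
```

A caller that needs contiguous data would test seg_is_linear() first and only pay for the copy in seg_linearize() when the check fails, which mirrors the explicit two-step usage described above.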
> 
> v9: - Rebase on master e4e2009 ("tunnel, tests: Sort flow output in ERSPAN
>       v1/v2 metadata");
>     - Simplify patch 09/14. The functions introduced in packets.c were dropped
>       so the code in netdev-native-tnl.c remains largely the same. These can be
>       introduced at a later time, if needed (maybe when csum across segmented
>       data is introduced);
> 
> v8: - Rebase on master 1dd218a ("ovsdb-idl: Fix recently introduced Python 3
>       tests.");
>     - Address Ian's comment:
>       - Fix sparse warnings on patch 07/14 and 12/14 by allocating memory
>         dynamically.
>     - Address Ilya's comments:
>       - netdev_linux_tap_batch_send() and udp_extract_tnl_md() now linearize
>         the data beforehand, before write()'ing or performing the checksums
>         in the data;
>       - Some other cases have been found and adapted; the new patch 09/14
>         introduced in the series is where the "linearization" logic is
>         introduced and, as a consequence, some users of the dp_packet API,
>         which were assuming the data is held contiguously in memory, are
>         changed to use the new APIs.
>     - Add support for multi-segment mbufs to dp_packet_equal() (patch 06/14);
>     - Fix a bug in patch 08/14 where the call to dp_packet_copy_mbuf_flags() in
>       dp_packet_clone_with_headroom() was incorrectly setting the nb_segs field
>       on the destination mbuf;
>     - Add unit-tests for dp_packet_equal() and (new) dp_packet_linear_data() to
>       patch 12/14;
>     - Add a comment to jumbo-frames.rst under topics/dpdk/ to warn how
>       multi-segment mbufs may affect performance when using large packets
>       across DPDK and non-DPDK ports.
> 
> v7: - Rebase on master 024810c ("Prepare for post-2.10.0 (2.10.90).");
>     - Add Ben's proposed fix for automake's warning;
>     - Add a note to cover letter to explain this is preparatory work for TSO /
>       GRO.
> 
> v6: - Rebase on master d1b235d ("tests: Add test for ovn-nbctl's command parser
>       error paths.");
>     - Address Darrell's comments:
>       - The changes in dp_packet_resize__() were trying to alleviate the call
>         to OVS_NOT_REACHED() for DPDK packets, by trying to reuse the available
>         tailroom space when no more headroom space is available, and vice-versa.
>         However, this was breaking the API for the dp_packet_resize__()
>         function (only called in dp_packet_prealloc_tailroom() and
>         dp_packet_prealloc_headroom()), which doesn't seem to suit the purposes
>         for DPDK packets.
>         Instead, and because this is isolated functionality, revert to the
>         previous state where dp_packet_resize__() is not supported for DPDK
>         packets. Hence, patch 08/14 has been dropped.
>     - Additionally, fix the tests that were relying on the removed
>       functionality.
> 
> v5: - Rebase on master 030958a0cc ("conntrack: Fix conn_update_state_alg use
>       after free.");
>     - Address Eelco's comments:
>       - Remove dpdk_mp_sweep() call in netdev_dpdk_mempool_configure(), a
>         leftover from rebase. Only call should be in dpdk_mp_get();
>       - Remove NEWS line added by mistake during rebase (about adding
>         experimental vhost zero copy support).
>     - Address Ian's comments:
>       - Drop patch 01 from previous series entirely;
>       - Patch (now) 01/14 adds a new call to dpdk_buf_size() inside
>         dpdk_mp_create() to get the correct "mbuf_size" to be used;
>       - Patch (now) 11/14 modifies dpdk_mp_create() to check if multi-segment
>         mbufs is enabled, in which case it calculates the new "mbuf_size" to be
>         used;
>       - In free_dpdk_buf() and dpdk_buf_alloc(), don't lock and unlock
>         conditionally.
>     - Add "per-port-memory=true" to test "Multi-segment mbufs Tx" as the current
>       DPDK set up in system-dpdk-testsuite can't handle higher MTU sizes using
>       the shared mempool model (runs out of memory);
>     - Add new examples for when multi-segment mbufs are enabled in
>       topics/dpdk/memory.rst, and a reference to topics/dpdk/jumbo-frames.rst
>       (now patch 11/14).
> 
> v4: - Rebase on master b22fb00 ("ovn-nbctl: Clarify error messages in qos-add
>       command."):
>       - A new patch (now 01/15) has been introduced to differentiate between
>         MTU and mbuf size when creating the mempool. This is because as part of
>         the new support for both per port and shared mempools, mempools were
>         being reused based on the mbuf size, and for multi-segment mbufs the
>         mbuf size can end up being the same for all mbufs;
>       - A couple of other patches (now 02/15 and 12/15) have been modified as
>         part of the rebase, but only to adapt to the code changes to "Support
>         both shared and per port mempools.", no functionality should have been
>         changed.
> 
> v3:
>     - Address Eelco's comment:
>         - Fix the ovs_assert() introduced in v2 in __packet_set_data(), which
>           wasn't correctly asserting that the passed 'v' was smaller than the
>           first mbuf's buf_len.
> 
> v2:
>     - Rebase on master e7cd8cf ("db-ctl-base: Don't die in cmd_destroy() on
>       error.");
>     - Address Eelco's comments:
>         - In free_dpdk_buf(), use mbuf's struct address in dp_packet instead of
>           casting;
>         - Remove unneeded variable in dp_packet_set_size(), pointing to the
>           first mbuf in the chain;
>         - Assert in dp_packet_set_size() to enforce that "pkt_len == v" is
>           always true for DPBUF_DPDK packets;
>         - Assert in __packet_set_data() to enforce that data_off never goes
>           beyond the first mbuf in the chain.
> 
> v1:
>     - v8 should have been sent as v1 really, as that's usually the approach
>       followed in OvS. That clearly didn't happen, so restarting the series
>       now. This also helps make it clear it is no longer an RFC series;
>     - Rebase on master e461481 ("ofp-meter: Fix ofp_print_meter_flags()
>       output.");
>     - Address Eelco's comments:
>         - Change dp_packet_size() so that for `DPBUF_DPDK` packets their
>           `pkt_len` and `data_len` can't be set to values bigger than the
>           available space. Also fix assignment to `data_len` which was
>           incorrectly being set to just `pkt_len`;
>         - Improve `nonpmd_mp_mutex` comment with a better explanation as to
>           why the mutex is needed;
>         - Fix dp_packet_clear() to not call rte_pktmbuf_reset() for non
>           `DPBUF_DPDK` packets;
>         - Dropped `if` clause in dp_packet_l4_size(), keep just the `else`;
>         - Change dp_packet_clone_with_headroom() to use rte_pktmbuf_read() for
>           copying `DPBUF_DPDK` packets' data. Also, change it to return
>           appropriate and meaningful errors, instead of just "0" or "1";
>         - Rename dpdk_prep_tx_buf() to dpdk_clone_dp_packet_to_mbuf(),
>           and reuse dp_packet_mbuf_write() instead of manual copy;
>         - Add note to vswitchd/vswitch.xml to make it clear the enabling of
>           dpdk-multi-seg-mbufs requires a restart;
>     - Change dpdk_mp_create() to increase # mbufs used under the multi-segment
>       mbufs approach;
>     - Increase test coverage by adding "end-to-end" tests that verify that
>       "dpdk-multi-seg-mbufs" is disabled by default and that a packet is
>       successfully sent out;
>     - Some helper funcs such as dp_packet_tail() and dp_packet_end() were moved
>       back to be common between `DPBUF_DPDK` and non `DPBUF_DPDK` packets, to
>       minimise changes;
>     - Add some extra notes to "Performance notes" in jumbo-frames.rst doc,
>       after further testing;
>     - Structure changes:
>         - Drop patch 07/13 which is now unneeded;
>         - Two more patches added for extra test coverage. This is what accounts
>           for the increase in size (+1 patch) in the series.
> 
> v8 (non-RFC):
>     - Rebase on master 88125d6 ("rhel: remove ovs-sim man page from
>       temporary directory (also for RHEL)");
>     - Address Ciara's and Ilya's comments:
>         - Drop the dp_packet_mbuf_tail() function and use only the
>           already existing dp_packet_tail();
>         - Fix bug in dpdk_do_tx_copy() where the error return from
>           dpdk_prep_tx_buf() was being wrongly checked;
>         - Use dpdk_buf_alloc() and free_dpdk_buf() instead of
>           rte_pktmbuf_alloc() and rte_pktmbuf_free();
>         - Fix some other code style and duplication issues pointed out.
>     - Refactored dp_packet_shift(), dp_packet_resize__() and
>       dp_packet_put*() functions to work within the bounds of existing
>       mbufs only;
>     - Fix dp_packet_clear() which wasn't correctly clearing / freeing
>       other mbufs in the chain for chains with more than a single mbuf;
>     - dp_packet layer functions (such as dp_packet_l3()) now check if
>       the header is within the first mbuf, when using mbufs;
>     - Move patch 08/13 to before patch 04/13, since dp_packet_set_size()
>       was refactored to use free_dpdk_buf();
>     - Fix wrong rte_memcpy() when performing dp_packet_clone() which was
>       leading to memory corruption;
>     - Modified the added tests to account for some of the above changes;
>     - Run performance tests, compiling results and adding them to the
>       cover letter;
>     - Add a multi-seg mbufs explanation to the jumbo-frames.rst doc,
>       together with a "Performance notes" sub-section reflecting the
>       findings mentioned above in the cover letter.
> 
> v7:  - Rebase on master 5e720da ("erspan: fix invalid erspan version.");
>      - Address Ilya's comments:
>         - Fix non-DPDK build;
>         - Serialise the access of non pmds to allocation and free of mbufs by
>           using a newly introduced mutex.
>      - Add a new set of tests that integrates with the recently added DPDK
>        testsuite. These focus on allocating dp_packets, with a single or
>        multiple mbufs, from an instantiated mempool and performing several
>        operations on those, verifying if the data at the end matches what's
>        expected;
>      - Fix bugs found by the new tests:
>         - dp_packet_shift() wasn't taking into account left shifts;
>         - dp_packet_resize__() was misusing and miscalculating the tailrooms
>           and headrooms, ending up calculating the wrong number of segments
>           that needed allocation;
>         - An mbuf's end was being miscalculated in dp_packet_tail(),
>           dp_packet_mbuf_tail() and dp_packet_end();
>         - dp_packet_set_size() was not updating the number of chained segments
>           'nb_segs';
>      - Add support for multi-seg mbufs in dp_packet_clear().
> 
> v6:  - Rebase on master 7c0cb29 ("conntrack-tcp: Handle tcp session
>        reuse.");
>      - Further improve dp_packet_put_uninit() and dp_packet_shift() to
>        support multi-seg mbufs;
>      - Add support for multi-seg mbufs in dp_packet_l4_size() and
>        improve other helper funcs, such as dp_packet_tail() and
>        dp_packet_tailroom().
>      - Add support for multi-seg mbufs in dp_packet_put(),
>        dp_packet_put_zeros(), as well as dp_packet_resize__() - allocating
>        new mbufs and linking them together;
>      Restructured patchset:
>      - Squash patch 5 into patch 6, since they were both related to
>        copying data while handling multi-seg mbufs;
>      - Split patch 4 into two separate patches - one that introduces the
>        changes in helper functions to deal with multi-seg mbufs and
>        two others for the shift() and put_uninit() functionality;
>      - Move patch 4 to before patch 3, so that helper functions come
>        before functionality improvements that rely on those helpers.
> 
> v5: - Rebased on master e5e22dc ("datapath-windows: Prevent ct-counters
>       from getting redundantly incremented");
>     - Sugesh's comments have been addressed:
>       - Changed dp_packet_set_data() and dp_packet_set_size() logic to
>         make them independent of each other;
>       - Dropped patch 3 now that dp_packet_set_data() and
>         dp_packet_set_size() are independent;
>       - dp_packet_clone_with_headroom() now has split functions for
>         handling DPDK sourced packets and non-DPDK packets;
>     - Modified various functions in dp-packet.h to account for multi-seg
>       mbufs - dp_packet_put_uninit(), dp_packet_tail(), dp_packet_tail()
>       and dp_packet_at();
>     - Added support for shifting packet data in multi-seg mbufs, using
>       dp_packet_shift();
>     - Fixed some minor inconsistencies.
> 
>     Note that some of the changes in v5 have been contributed by Mark
>     Kavanagh as well.
> 
> v4: - restructure patchset
>     - account for 128B ARM cacheline when sizing mbufs
> 
> Artur Twardowski (1):
>   dp-packet: Fix invalid size of ICMPv6 header
> 
> Mark Kavanagh (2):
>   netdev-dpdk: copy large packet to multi-seg. mbufs
>   netdev-dpdk: support multi-segment jumbo frames.
> 
> Michael Qiu (1):
>   dp-packet: copy data from multi-seg. DPDK mbuf
> 
> Michal Obrembski (3):
>   dpdk-tests: Fix Multi-segment DPDK Unittests
>   Fix build without DPDK
>   Fix DPDK MBUF tests compilation on some compilers
> 
> Tiago Lam (8):
>   netdev-dpdk: Serialise non-pmds mbufs' alloc/free.
>   dp-packet: Fix data_len handling multi-seg mbufs.
>   dp-packet: Handle multi-seg mbufs in helper funcs.
>   dp-packet: Handle multi-seg mubfs in shift() func.
>   dp-packet: Add support for data "linearization".
>   dpdk-tests: Add unit-tests for multi-seg mbufs.
>   dpdk-tests: Accept other configs in OVS_DPDK_START
>   dpdk-tests: End-to-end tests for multi-seg mbufs.
> 
>  Documentation/topics/dpdk/jumbo-frames.rst |  73 +++
>  Documentation/topics/dpdk/memory.rst       |  36 ++
>  NEWS                                       |   1 +
>  lib/conntrack.c                            |   5 +
>  lib/crc32c.c                               |  17 +-
>  lib/crc32c.h                               |   2 +
>  lib/dp-packet.c                            | 168 ++++++-
>  lib/dp-packet.h                            | 497 +++++++++++++++++---
>  lib/dpdk.c                                 |   8 +
>  lib/dpif-netdev.c                          |  18 +-
>  lib/dpif-netlink.c                         |   3 +
>  lib/dpif.c                                 |   6 +
>  lib/flow.c                                 | 110 ++++-
>  lib/flow.h                                 |   4 +-
>  lib/mcast-snooping.c                       |   2 +
>  lib/netdev-bsd.c                           |   3 +
>  lib/netdev-dpdk.c                          | 189 +++++++-
>  lib/netdev-dpdk.h                          |   1 +
>  lib/netdev-dummy.c                         |   6 +
>  lib/netdev-linux.c                         |   6 +
>  lib/netdev-native-tnl.c                    |  26 +-
>  lib/odp-execute.c                          |  24 +-
>  lib/packets.c                              |  96 +++-
>  lib/packets.h                              |   7 +
>  ofproto/ofproto-dpif-upcall.c              |  21 +-
>  ofproto/ofproto-dpif-xlate.c               |  27 +-
>  tests/automake.mk                          |  10 +-
>  tests/dpdk-packet-mbufs.at                 |   7 +
>  tests/system-dpdk-macros.at                |   7 +-
>  tests/system-dpdk-testsuite.at             |   1 +
>  tests/system-dpdk.at                       |  67 +++
>  tests/test-dpdk-mbufs.c                    | 722 +++++++++++++++++++++++++++++
>  tests/test-rstp.c                          |   9 +-
>  tests/test-stp.c                           |   9 +-
>  vswitchd/vswitch.xml                       |  22 +
>  35 files changed, 2064 insertions(+), 146 deletions(-)
>  create mode 100644 tests/dpdk-packet-mbufs.at
>  create mode 100644 tests/test-dpdk-mbufs.c
> 
> -- 
> 2.7.4
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev