
[ovs-dev] Reply: [sent on behalf of openvswitch.org] [PATCH v2] netdev-afxdp: Best-effort configuration of XDP mode.

Message ID 18fb68b235f249d089f01c57e268a422@inspur.com
State Not Applicable, archived

Commit Message

Yi Yang (杨燚)-云服务集团 Nov. 19, 2019, 9 a.m. UTC
Hi, Ilya

Can you explain what the kernel limitations are for TCP over veth? I can't
understand why veth has such limitations only for TCP. I saw a veth bug
(https://tech.vijayp.ca/linux-kernel-bug-delivers-corrupt-tcp-ip-data-to-mesos-kubernetes-docker-containers-4986f88f7a19)
but it was fixed in 2016.

-----Original Message-----
From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-bounces@openvswitch.org] On Behalf Of Ilya Maximets
Sent: November 7, 2019 19:37
To: ovs-dev@openvswitch.org
Cc: Ilya Maximets <i.maximets@ovn.org>
Subject: [sent on behalf of openvswitch.org][ovs-dev] [PATCH v2] netdev-afxdp: Best-effort
configuration of XDP mode.

Until now there were only two options for XDP mode in OVS: SKB or DRV,
i.e. 'generic XDP' or 'native XDP with zero-copy enabled'.

Devices like 'veth' interfaces in Linux support native XDP, but don't
support zero-copy mode.  This case cannot be covered by the existing API and
we have to use slower generic XDP for such devices.
There are a few more issues, e.g. TCP is not supported in generic XDP mode for
veth interfaces due to kernel limitations; however, it is supported in native
mode.

This change introduces the ability to use native XDP without zero-copy along
with a best-effort configuration option that is enabled by default.
In the best-effort case OVS will sequentially try different modes starting from
the fastest one and will choose the first one acceptable for the current
interface.  This selects the fastest mode that the interface supports.
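
The best-effort selection described above can be sketched in isolation as
follows.  This is only an illustrative sketch, not the code from the patch:
the enum, the probe callback type and the veth-like probe are hypothetical
names made up for the example, and the probe stands in for an attempt to
create an AF_XDP socket in the given mode.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mode list, ordered from fastest to slowest. */
enum xdp_mode {
    MODE_NATIVE_ZC,   /* native-with-zerocopy */
    MODE_NATIVE,      /* native */
    MODE_GENERIC,     /* generic */
    MODE_MAX,
};

/* Hypothetical probe: returns true if the interface accepts 'mode'. */
typedef bool (*probe_fn)(enum xdp_mode mode, void *aux);

/* Tries the modes from fastest to slowest and returns the first one the
 * probe accepts, or MODE_MAX if no mode is usable. */
enum xdp_mode
pick_best_effort(probe_fn probe, void *aux)
{
    for (int m = MODE_NATIVE_ZC; m < MODE_MAX; m++) {
        if (probe((enum xdp_mode) m, aux)) {
            return (enum xdp_mode) m;
        }
    }
    return MODE_MAX;
}

/* Example probe modelling a veth-like device (assumption for this sketch):
 * native XDP works, zero-copy does not. */
bool
veth_like_probe(enum xdp_mode mode, void *aux)
{
    (void) aux;
    return mode != MODE_NATIVE_ZC;
}
```

With the veth-like probe, pick_best_effort() skips the zero-copy mode and
settles on MODE_NATIVE, which matches the behaviour described for veth
interfaces in this commit message.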

If the user wants to choose a specific mode, it's still possible by setting
'options:xdp-mode'.

This change additionally changes the API by renaming the configuration knob
from 'xdpmode' to 'xdp-mode' and also renaming the modes themselves to be
more user-friendly.

The full list of currently supported modes:
  * native-with-zerocopy - former DRV
  * native               - new one, DRV without zero-copy
  * generic              - former SKB
  * best-effort          - new one, chooses the best available of
                           the 3 modes above

Since 'best-effort' is the default mode, users will not need to explicitly
set 'xdp-mode' in most cases.
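
As a rough illustration of how an 'xdp-mode' string could be mapped to a mode
with 'best-effort' as the default, here is a minimal standalone sketch.  It is
not the code from the patch: the enum, the name table and parse_xdp_mode() are
hypothetical, and the case-insensitive comparison is hand-written so that the
example needs only standard C.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical option values; OPT_MAX marks an unknown string. */
enum opt_mode {
    OPT_BEST_EFFORT,
    OPT_NATIVE_ZC,
    OPT_NATIVE,
    OPT_GENERIC,
    OPT_MAX,
};

static const char *const opt_names[OPT_MAX] = {
    [OPT_BEST_EFFORT] = "best-effort",
    [OPT_NATIVE_ZC]   = "native-with-zerocopy",
    [OPT_NATIVE]      = "native",
    [OPT_GENERIC]     = "generic",
};

/* Case-insensitive string equality using only standard C. */
static int
names_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b;   /* Equal only if both strings ended together. */
}

/* Maps an option string to a mode.  NULL (option not set) selects the
 * default 'best-effort'; an unknown string yields OPT_MAX. */
enum opt_mode
parse_xdp_mode(const char *value)
{
    if (!value) {
        return OPT_BEST_EFFORT;
    }
    for (int m = OPT_BEST_EFFORT; m < OPT_MAX; m++) {
        if (names_equal(value, opt_names[m])) {
            return (enum opt_mode) m;
        }
    }
    return OPT_MAX;
}
```

In this sketch the former values such as "drv" or "skb" are not recognized
and map to OPT_MAX, so a caller would have to report them as incorrect.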

TCP-related tests are re-enabled in the system afxdp testsuite, because
'best-effort' will choose 'native' mode for veth interfaces and this mode
has no issues with TCP.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---

With this patch I modified the user-visible API, but I think it's OK since
it's still an experimental netdev.  Comments are welcome.

Version 2:
  * Rebased on current master.

 Documentation/intro/install/afxdp.rst |  54 ++++---
 NEWS                                  |  12 +-
 lib/netdev-afxdp.c                    | 223 ++++++++++++++++----------
 lib/netdev-afxdp.h                    |   9 ++
 lib/netdev-linux-private.h            |   8 +-
 tests/system-afxdp-macros.at          |   7 -
 vswitchd/vswitch.xml                  |  38 +++--
 7 files changed, 227 insertions(+), 124 deletions(-)

+            <code>best-effort</code> tries to detect and choose the best
+            (fastest) from the available modes for current interface.
+          </p>
+          <p>
+            Note that this option is specific to netdev-afxdp.
+            Defaults to <code>best-effort</code> mode.
+          </p>
         </p>
       </column>
 
--
2.17.1

Comments

Ilya Maximets Nov. 19, 2019, 11:54 a.m. UTC | #1
On 19.11.2019 10:00, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, Ilya
> 
> Can you explain what kernel limitations are for TCP for veth? I can't
> understand why veth has such limitations only for TCP. I saw a veth bug
> (https://tech.vijayp.ca/linux-kernel-bug-delivers-corrupt-tcp-ip-data-to-mesos-kubernetes-docker-containers-4986f88f7a19)
> but it has been fixed in 2016.

Hi.

Have you read the issue referenced in docs:
https://github.com/cilium/cilium/issues/3077
?

In short, TCP stack clones the packets and netif_receive_generic_xdp()
drops all the cloned packets. Native XDP for veth seems to work fine.

Best regards, Ilya Maximets.
Yi Yang (杨燚)-云服务集团 Nov. 20, 2019, 12:12 a.m. UTC | #2
Ilya, got it, thanks a lot.


Patch

diff --git a/Documentation/intro/install/afxdp.rst b/Documentation/intro/install/afxdp.rst
index a136db0c9..937770ad0 100644
--- a/Documentation/intro/install/afxdp.rst
+++ b/Documentation/intro/install/afxdp.rst
@@ -153,9 +153,8 @@  To kick start end-to-end autotesting::
   make check-afxdp TESTSUITEFLAGS='1'
 
 .. note::
-   Not all test cases pass at this time. Currenly all TCP related
-   tests, ex: using wget or http, are skipped due to XDP limitations
-   on veth. cvlan test is also skipped.
+   Not all test cases pass at this time. Currently all cvlan tests are skipped
+   due to kernel issues.
 
 If a test case fails, check the log at::
 
@@ -177,33 +176,35 @@  in :doc:`general`::
   ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
 
 Make sure your device driver support AF_XDP, netdev-afxdp supports
-the following additional options (see man ovs-vswitchd.conf.db for
+the following additional options (see ``man ovs-vswitchd.conf.db`` for
 more details):
 
- * **xdpmode**: use "drv" for driver mode, or "skb" for skb mode.
+ * ``xdp-mode``: ``best-effort``, ``native-with-zerocopy``,
+   ``native`` or ``generic``.  Defaults to ``best-effort``, i.e. best of
+   supported modes, so in most cases you don't need to change it.
 
- * **use-need-wakeup**: default "true" if libbpf supports it, otherwise false.
+ * ``use-need-wakeup``: default ``true`` if libbpf supports it,
+   otherwise ``false``.
 
 For example, to use 1 PMD (on core 4) on 1 queue (queue 0) device,
-configure these options: **pmd-cpu-mask, pmd-rxq-affinity, and n_rxq**.
-The **xdpmode** can be "drv" or "skb"::
+configure these options: ``pmd-cpu-mask``, ``pmd-rxq-affinity``, and
+``n_rxq``::
 
   ethtool -L enp2s0 combined 1
   ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x10
   ovs-vsctl add-port br0 enp2s0 -- set interface enp2s0 type="afxdp" \
-    options:n_rxq=1 options:xdpmode=drv \
-    other_config:pmd-rxq-affinity="0:4"
+                                   other_config:pmd-rxq-affinity="0:4"
 
 Or, use 4 pmds/cores and 4 queues by doing::
 
   ethtool -L enp2s0 combined 4
   ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x36
   ovs-vsctl add-port br0 enp2s0 -- set interface enp2s0 type="afxdp" \
-    options:n_rxq=4 options:xdpmode=drv \
-    other_config:pmd-rxq-affinity="0:1,1:2,2:3,3:4"
+    options:n_rxq=4 other_config:pmd-rxq-affinity="0:1,1:2,2:3,3:4"
 
 .. note::
-   pmd-rxq-affinity is optional. If not specified, system will auto-assign.
+   ``pmd-rxq-affinity`` is optional. If not specified, system will auto-assign.
+   ``n_rxq`` equals ``1`` by default.
 
 To validate that the bridge has successfully instantiated, you can use the::
 
@@ -214,12 +215,21 @@  Should show something like::
   Port "ens802f0"
    Interface "ens802f0"
       type: afxdp
-      options: {n_rxq="1", xdpmode=drv}
+      options: {n_rxq="1"}
 
 Otherwise, enable debugging by::
 
   ovs-appctl vlog/set netdev_afxdp::dbg
 
+To check which XDP mode was chosen by ``best-effort``, you can look for 
+``xdp-mode-in-use`` in the output of ``ovs-appctl dpctl/show``::
+
+  # ovs-appctl dpctl/show
+  netdev@ovs-netdev:
+    <...>
+    port 2: ens802f0 (afxdp: n_rxq=1, use-need-wakeup=true,
+                      xdp-mode=best-effort,
+                      xdp-mode-in-use=native-with-zerocopy)
 
 References
 ----------
@@ -323,8 +333,11 @@  Limitations/Known Issues
 #. Most of the tests are done using i40e single port. Multiple ports and
    also ixgbe driver also needs to be tested.
 #. No latency test result (TODO items)
-#. Due to limitations of current upstream kernel, TCP and various offloading
+#. Due to limitations of current upstream kernel, various offloading
    (vlan, cvlan) is not working over virtual interfaces (i.e. veth pair).
+   Also, TCP is not working over virtual interfaces in generic XDP mode.
+   Some more information and possible workaround available `here
+   <https://github.com/cilium/cilium/issues/3077#issuecomment-430801467>`__.
 
 
 PVP using tap device
@@ -335,8 +348,7 @@  First, start OVS, then add physical port::
   ethtool -L enp2s0 combined 1
   ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x10
   ovs-vsctl add-port br0 enp2s0 -- set interface enp2s0 type="afxdp" \
-    options:n_rxq=1 options:xdpmode=drv \
-    other_config:pmd-rxq-affinity="0:4"
+    options:n_rxq=1 other_config:pmd-rxq-affinity="0:4"
 
 Start a VM with virtio and tap device::
 
@@ -414,13 +426,11 @@  Create namespace and veth peer devices::
 
 Attach the veth port to br0 (linux kernel mode)::
 
-  ovs-vsctl add-port br0 afxdp-p0 -- \
-    set interface afxdp-p0 options:n_rxq=1
+  ovs-vsctl add-port br0 afxdp-p0 -- set interface afxdp-p0
 
-Or, use AF_XDP with skb mode::
+Or, use AF_XDP::
 
-  ovs-vsctl add-port br0 afxdp-p0 -- \
-    set interface afxdp-p0 type="afxdp" options:n_rxq=1 options:xdpmode=skb
+  ovs-vsctl add-port br0 afxdp-p0 -- set interface afxdp-p0 type="afxdp"
 
 Setup the OpenFlow rules::
 
diff --git a/NEWS b/NEWS
index 88b818948..d5f476d6e 100644
--- a/NEWS
+++ b/NEWS
@@ -5,11 +5,19 @@  Post-v2.12.0
        separate project. You can find it at
        https://github.com/ovn-org/ovn.git
    - Userspace datapath:
+     * Add option to enable, disable and query TCP sequence checking in
+       conntrack.
+   - AF_XDP:
      * New option 'use-need-wakeup' for netdev-afxdp to control enabling
        of corresponding 'need_wakeup' flag in AF_XDP rings.  Enabled by default
        if supported by libbpf.
-     * Add option to enable, disable and query TCP sequence checking in
-       conntrack.
+     * 'xdpmode' option for netdev-afxdp renamed to 'xdp-mode'.
+       Modes also updated.  New values:
+         native-with-zerocopy  - former DRV
+         native                - new one, DRV without zero-copy
+         generic               - former SKB
+         best-effort [default] - new one, chooses the best available from
+                                 3 above modes
 
 v2.12.0 - 03 Sep 2019
 ---------------------
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index af654d498..74dde219d 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -89,12 +89,42 @@  BUILD_ASSERT_DECL(PROD_NUM_DESCS == CONS_NUM_DESCS);
 #define UMEM2DESC(elem, base) ((uint64_t)((char *)elem - (char *)base))
 
 static struct xsk_socket_info *xsk_configure(int ifindex, int xdp_queue_id,
-                                             int mode, bool use_need_wakeup);
-static void xsk_remove_xdp_program(uint32_t ifindex, int xdpmode);
+                                             enum afxdp_mode mode,
+                                             bool use_need_wakeup,
+                                             bool report_socket_failures);
+static void xsk_remove_xdp_program(uint32_t ifindex, enum afxdp_mode);
 static void xsk_destroy(struct xsk_socket_info *xsk);
 static int xsk_configure_all(struct netdev *netdev);
 static void xsk_destroy_all(struct netdev *netdev);
 
+static struct {
+    const char *name;
+    uint32_t bind_flags;
+    uint32_t xdp_flags;
+} xdp_modes[] = {
+    [OVS_AF_XDP_MODE_UNSPEC] = {
+        .name = "unspecified", .bind_flags = 0, .xdp_flags = 0,
+    },
+    [OVS_AF_XDP_MODE_BEST_EFFORT] = {
+        .name = "best-effort", .bind_flags = 0, .xdp_flags = 0,
+    },
+    [OVS_AF_XDP_MODE_NATIVE_ZC] = {
+        .name = "native-with-zerocopy",
+        .bind_flags = XDP_ZEROCOPY,
+        .xdp_flags = XDP_FLAGS_DRV_MODE,
+    },
+    [OVS_AF_XDP_MODE_NATIVE] = {
+        .name = "native",
+        .bind_flags = XDP_COPY,
+        .xdp_flags = XDP_FLAGS_DRV_MODE,
+    },
+    [OVS_AF_XDP_MODE_GENERIC] = {
+        .name = "generic",
+        .bind_flags = XDP_COPY,
+        .xdp_flags = XDP_FLAGS_SKB_MODE,
+    },
+};
+
 struct unused_pool {
     struct xsk_umem_info *umem_info;
     int lost_in_rings; /* Number of packets left in tx, rx, cq and fq. */
@@ -214,7 +244,7 @@  netdev_afxdp_sweep_unused_pools(void *aux OVS_UNUSED)
 }
 
 static struct xsk_umem_info *
-xsk_configure_umem(void *buffer, uint64_t size, int xdpmode)
+xsk_configure_umem(void *buffer, uint64_t size)
 {
     struct xsk_umem_config uconfig;
     struct xsk_umem_info *umem;
@@ -232,9 +262,7 @@  xsk_configure_umem(void *buffer, uint64_t size, int xdpmode)
     ret = xsk_umem__create(&umem->umem, buffer, size, &umem->fq, &umem->cq,
                            &uconfig);
     if (ret) {
-        VLOG_ERR("xsk_umem__create failed (%s) mode: %s",
-                 ovs_strerror(errno),
-                 xdpmode == XDP_COPY ? "SKB": "DRV");
+        VLOG_ERR("xsk_umem__create failed: %s.", ovs_strerror(errno));
         free(umem);
         return NULL;
     }
@@ -290,7 +318,8 @@  xsk_configure_umem(void *buffer, uint64_t size, int xdpmode)
 
 static struct xsk_socket_info *
 xsk_configure_socket(struct xsk_umem_info *umem, uint32_t ifindex,
-                     uint32_t queue_id, int xdpmode, bool use_need_wakeup)
+                     uint32_t queue_id, enum afxdp_mode mode,
+                     bool use_need_wakeup, bool report_socket_failures)
 {
     struct xsk_socket_config cfg;
     struct xsk_socket_info *xsk;
@@ -304,14 +333,8 @@  xsk_configure_socket(struct xsk_umem_info *umem, uint32_t ifindex,
     cfg.rx_size = CONS_NUM_DESCS;
     cfg.tx_size = PROD_NUM_DESCS;
     cfg.libbpf_flags = 0;
-
-    if (xdpmode == XDP_ZEROCOPY) {
-        cfg.bind_flags = XDP_ZEROCOPY;
-        cfg.xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_DRV_MODE;
-    } else {
-        cfg.bind_flags = XDP_COPY;
-        cfg.xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_SKB_MODE;
-    }
+    cfg.bind_flags = xdp_modes[mode].bind_flags;
+    cfg.xdp_flags = xdp_modes[mode].xdp_flags | XDP_FLAGS_UPDATE_IF_NOEXIST;
 
 #ifdef HAVE_XDP_NEED_WAKEUP
     if (use_need_wakeup) {
@@ -329,12 +352,11 @@  xsk_configure_socket(struct xsk_umem_info *umem, uint32_t ifindex,
     ret = xsk_socket__create(&xsk->xsk, devname, queue_id, umem->umem,
                              &xsk->rx, &xsk->tx, &cfg);
     if (ret) {
-        VLOG_ERR("xsk_socket__create failed (%s) mode: %s "
-                 "use-need-wakeup: %s qid: %d",
-                 ovs_strerror(errno),
-                 xdpmode == XDP_COPY ? "SKB": "DRV",
-                 use_need_wakeup ? "true" : "false",
-                 queue_id);
+        VLOG(report_socket_failures ? VLL_ERR : VLL_DBG,
+             "xsk_socket__create failed (%s) mode: %s, "
+             "use-need-wakeup: %s, qid: %d",
+             ovs_strerror(errno), xdp_modes[mode].name,
+             use_need_wakeup ? "true" : "false", queue_id);
         free(xsk);
         return NULL;
     }
@@ -375,8 +397,8 @@  xsk_configure_socket(struct xsk_umem_info *umem, uint32_t ifindex,
 }
 
 static struct xsk_socket_info *
-xsk_configure(int ifindex, int xdp_queue_id, int xdpmode,
-              bool use_need_wakeup)
+xsk_configure(int ifindex, int xdp_queue_id, enum afxdp_mode mode,
+              bool use_need_wakeup, bool report_socket_failures)
 {
     struct xsk_socket_info *xsk;
     struct xsk_umem_info *umem;
@@ -389,9 +411,7 @@  xsk_configure(int ifindex, int xdp_queue_id, int xdpmode,
     memset(bufs, 0, NUM_FRAMES * FRAME_SIZE);
 
     /* Create AF_XDP socket. */
-    umem = xsk_configure_umem(bufs,
-                              NUM_FRAMES * FRAME_SIZE,
-                              xdpmode);
+    umem = xsk_configure_umem(bufs, NUM_FRAMES * FRAME_SIZE);
     if (!umem) {
         free_pagealign(bufs);
         return NULL;
@@ -399,8 +419,8 @@  xsk_configure(int ifindex, int xdp_queue_id, int xdpmode,
 
     VLOG_DBG("Allocated umem pool at 0x%"PRIxPTR, (uintptr_t) umem);
 
-    xsk = xsk_configure_socket(umem, ifindex, xdp_queue_id, xdpmode,
-                               use_need_wakeup);
+    xsk = xsk_configure_socket(umem, ifindex, xdp_queue_id, mode,
+                               use_need_wakeup, report_socket_failures);
     if (!xsk) {
         /* Clean up umem and xpacket pool. */
         if (xsk_umem__delete(umem->umem)) {
@@ -414,12 +434,38 @@  xsk_configure(int ifindex, int xdp_queue_id, int xdpmode,
     return xsk;
 }
 
+static int
+xsk_configure_queue(struct netdev_linux *dev, int ifindex, int queue_id,
+                    enum afxdp_mode mode, bool report_socket_failures) 
+{
+    struct xsk_socket_info *xsk_info;
+
+    VLOG_DBG("%s: configuring queue: %d, mode: %s, use-need-wakeup: %s.",
+             netdev_get_name(&dev->up), queue_id, xdp_modes[mode].name,
+             dev->use_need_wakeup ? "true" : "false");
+    xsk_info = xsk_configure(ifindex, queue_id, mode, dev->use_need_wakeup,
+                             report_socket_failures);
+    if (!xsk_info) {
+        VLOG(report_socket_failures ? VLL_ERR : VLL_DBG,
+             "%s: Failed to create AF_XDP socket on queue %d in %s mode.",
+             netdev_get_name(&dev->up), queue_id, xdp_modes[mode].name);
+        dev->xsks[queue_id] = NULL;
+        return -1;
+    }
+    dev->xsks[queue_id] = xsk_info;
+    atomic_init(&xsk_info->tx_dropped, 0);
+    xsk_info->outstanding_tx = 0;
+    xsk_info->available_rx = PROD_NUM_DESCS;
+    return 0;
+}
+
+
 static int
 xsk_configure_all(struct netdev *netdev)
 {
     struct netdev_linux *dev = netdev_linux_cast(netdev);
-    struct xsk_socket_info *xsk_info;
     int i, ifindex, n_rxq, n_txq;
+    int qid = 0;
 
     ifindex = linux_get_ifindex(netdev_get_name(netdev));
 
@@ -429,23 +475,36 @@  xsk_configure_all(struct netdev *netdev)
     n_rxq = netdev_n_rxq(netdev);
     dev->xsks = xcalloc(n_rxq, sizeof *dev->xsks);
 
-    /* Configure each queue. */
-    for (i = 0; i < n_rxq; i++) {
-        VLOG_DBG("%s: configure queue %d mode %s use-need-wakeup %s.",
-                 netdev_get_name(netdev), i,
-                 dev->xdpmode == XDP_COPY ? "SKB" : "DRV",
-                 dev->use_need_wakeup ? "true" : "false");
-        xsk_info = xsk_configure(ifindex, i, dev->xdpmode,
-                                 dev->use_need_wakeup);
-        if (!xsk_info) {
-            VLOG_ERR("Failed to create AF_XDP socket on queue %d.", i);
-            dev->xsks[i] = NULL;
+    if (dev->xdp_mode == OVS_AF_XDP_MODE_BEST_EFFORT) {
+        /* Trying to configure first queue with different modes to
+         * find the most suitable. */
+        for (i = OVS_AF_XDP_MODE_NATIVE_ZC; i < OVS_AF_XDP_MODE_MAX; i++) {
+            if (!xsk_configure_queue(dev, ifindex, qid, i,
+                                     i == OVS_AF_XDP_MODE_MAX - 1)) {
+                dev->xdp_mode_in_use = i;
+                VLOG_INFO("%s: %s XDP mode will be in use.",
+                          netdev_get_name(netdev), xdp_modes[i].name);
+                break;
+            }
+        }
+        if (i == OVS_AF_XDP_MODE_MAX) {
+            VLOG_ERR("%s: Failed to detect suitable XDP mode.",
+                     netdev_get_name(netdev));
+            goto err;
+        }
+        qid++;
+    } else {
+        dev->xdp_mode_in_use = dev->xdp_mode;
+    }
+
+    /* Configure remaining queues. */
+    for (; qid < n_rxq; qid++) {
+        if (xsk_configure_queue(dev, ifindex, qid,
+                                dev->xdp_mode_in_use, true)) {
+            VLOG_ERR("%s: Failed to create AF_XDP socket on queue %d.",
+                     netdev_get_name(netdev), qid);
             goto err;
         }
-        dev->xsks[i] = xsk_info;
-        atomic_init(&xsk_info->tx_dropped, 0);
-        xsk_info->outstanding_tx = 0;
-        xsk_info->available_rx = PROD_NUM_DESCS;
     }
 
     n_txq = netdev_n_txq(netdev);
@@ -500,7 +559,7 @@  xsk_destroy_all(struct netdev *netdev)
             if (dev->xsks[i]) {
                 xsk_destroy(dev->xsks[i]);
                 dev->xsks[i] = NULL;
-                VLOG_INFO("Destroyed xsk[%d].", i);
+                VLOG_DBG("%s: Destroyed xsk[%d].", netdev_get_name(netdev), i);
             }
         }
 
@@ -510,7 +569,7 @@  xsk_destroy_all(struct netdev *netdev)
 
     VLOG_INFO("%s: Removing xdp program.", netdev_get_name(netdev));
     ifindex = linux_get_ifindex(netdev_get_name(netdev));
-    xsk_remove_xdp_program(ifindex, dev->xdpmode);
+    xsk_remove_xdp_program(ifindex, dev->xdp_mode_in_use);
 
     if (dev->tx_locks) {
         for (i = 0; i < netdev_n_txq(netdev); i++) {
@@ -526,9 +585,10 @@  netdev_afxdp_set_config(struct netdev *netdev, const struct smap *args,
                         char **errp OVS_UNUSED)
 {
     struct netdev_linux *dev = netdev_linux_cast(netdev);
-    const char *str_xdpmode;
-    int xdpmode, new_n_rxq;
+    const char *str_xdp_mode;
+    enum afxdp_mode xdp_mode;
     bool need_wakeup;
+    int new_n_rxq;
 
     ovs_mutex_lock(&dev->mutex);
     new_n_rxq = MAX(smap_get_int(args, "n_rxq", NR_QUEUE), 1);
@@ -539,14 +599,17 @@  netdev_afxdp_set_config(struct netdev *netdev, const struct smap *args,
         return EINVAL;
     }
 
-    str_xdpmode = smap_get_def(args, "xdpmode", "skb");
-    if (!strcasecmp(str_xdpmode, "drv")) {
-        xdpmode = XDP_ZEROCOPY;
-    } else if (!strcasecmp(str_xdpmode, "skb")) {
-        xdpmode = XDP_COPY;
-    } else {
-        VLOG_ERR("%s: Incorrect xdpmode (%s).",
-                 netdev_get_name(netdev), str_xdpmode);
+    str_xdp_mode = smap_get_def(args, "xdp-mode", "best-effort");
+    for (xdp_mode = OVS_AF_XDP_MODE_BEST_EFFORT;
+         xdp_mode < OVS_AF_XDP_MODE_MAX;
+         xdp_mode++) {
+        if (!strcasecmp(str_xdp_mode, xdp_modes[xdp_mode].name)) {
+            break;
+        }
+    }
+    if (xdp_mode == OVS_AF_XDP_MODE_MAX) {
+        VLOG_ERR("%s: Incorrect xdp-mode (%s).",
+                 netdev_get_name(netdev), str_xdp_mode);
         ovs_mutex_unlock(&dev->mutex);
         return EINVAL;
     }
@@ -560,10 +623,10 @@  netdev_afxdp_set_config(struct netdev *netdev, const struct smap *args,
 #endif
 
     if (dev->requested_n_rxq != new_n_rxq
-        || dev->requested_xdpmode != xdpmode
+        || dev->requested_xdp_mode != xdp_mode
         || dev->requested_need_wakeup != need_wakeup) {
         dev->requested_n_rxq = new_n_rxq;
-        dev->requested_xdpmode = xdpmode;
+        dev->requested_xdp_mode = xdp_mode;
         dev->requested_need_wakeup = need_wakeup;
         netdev_request_reconfigure(netdev);
     }
@@ -578,8 +641,9 @@  netdev_afxdp_get_config(const struct netdev *netdev, struct smap *args)
 
     ovs_mutex_lock(&dev->mutex);
     smap_add_format(args, "n_rxq", "%d", netdev->n_rxq);
-    smap_add_format(args, "xdpmode", "%s",
-                    dev->xdpmode == XDP_ZEROCOPY ? "drv" : "skb");
+    smap_add_format(args, "xdp-mode", "%s", xdp_modes[dev->xdp_mode].name);
+    smap_add_format(args, "xdp-mode-in-use", "%s",
+                    xdp_modes[dev->xdp_mode_in_use].name);
     smap_add_format(args, "use-need-wakeup", "%s",
                     dev->use_need_wakeup ? "true" : "false");
     ovs_mutex_unlock(&dev->mutex);
@@ -596,7 +660,7 @@  netdev_afxdp_reconfigure(struct netdev *netdev)
     ovs_mutex_lock(&dev->mutex);
 
     if (netdev->n_rxq == dev->requested_n_rxq
-        && dev->xdpmode == dev->requested_xdpmode
+        && dev->xdp_mode == dev->requested_xdp_mode
         && dev->use_need_wakeup == dev->requested_need_wakeup
         && dev->xsks) {
         goto out;
@@ -607,9 +671,9 @@  netdev_afxdp_reconfigure(struct netdev *netdev)
     netdev->n_rxq = dev->requested_n_rxq;
     netdev->n_txq = netdev->n_rxq;
 
-    dev->xdpmode = dev->requested_xdpmode;
+    dev->xdp_mode = dev->requested_xdp_mode;
     VLOG_INFO("%s: Setting XDP mode to %s.", netdev_get_name(netdev),
-              dev->xdpmode == XDP_ZEROCOPY ? "DRV" : "SKB");
+              xdp_modes[dev->xdp_mode].name);
 
     if (setrlimit(RLIMIT_MEMLOCK, &r)) {
         VLOG_ERR("setrlimit(RLIMIT_MEMLOCK) failed: %s", ovs_strerror(errno));
@@ -618,7 +682,8 @@  netdev_afxdp_reconfigure(struct netdev *netdev)
 
     err = xsk_configure_all(netdev);
     if (err) {
-        VLOG_ERR("AF_XDP device %s reconfig failed.", netdev_get_name(netdev));
+        VLOG_ERR("%s: AF_XDP device reconfiguration failed.",
+                 netdev_get_name(netdev));
     }
     netdev_change_seq_changed(netdev);
 out:
@@ -638,17 +703,9 @@  netdev_afxdp_get_numa_id(const struct netdev *netdev)
 }
 
 static void
-xsk_remove_xdp_program(uint32_t ifindex, int xdpmode)
+xsk_remove_xdp_program(uint32_t ifindex, enum afxdp_mode mode)
 {
-    uint32_t flags;
-
-    flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
-
-    if (xdpmode == XDP_COPY) {
-        flags |= XDP_FLAGS_SKB_MODE;
-    } else if (xdpmode == XDP_ZEROCOPY) {
-        flags |= XDP_FLAGS_DRV_MODE;
-    }
+    uint32_t flags = xdp_modes[mode].xdp_flags | XDP_FLAGS_UPDATE_IF_NOEXIST;
 
     bpf_set_link_xdp_fd(ifindex, -1, flags);
 }
@@ -662,7 +719,7 @@  signal_remove_xdp(struct netdev *netdev)
     ifindex = linux_get_ifindex(netdev_get_name(netdev));
 
     VLOG_WARN("Force removing xdp program.");
-    xsk_remove_xdp_program(ifindex, dev->xdpmode);
+    xsk_remove_xdp_program(ifindex, dev->xdp_mode_in_use);
 }
 
 static struct dp_packet_afxdp *
@@ -782,7 +839,8 @@  netdev_afxdp_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch,
 }
 
 static inline int
-kick_tx(struct xsk_socket_info *xsk_info, int xdpmode, bool use_need_wakeup)
+kick_tx(struct xsk_socket_info *xsk_info, enum afxdp_mode mode,
+        bool use_need_wakeup)
 {
     int ret, retries;
     static const int KERNEL_TX_BATCH_SIZE = 16;
@@ -791,11 +849,11 @@  kick_tx(struct xsk_socket_info *xsk_info, int xdpmode, bool use_need_wakeup)
         return 0;
     }
 
-    /* In SKB_MODE packet transmission is synchronous, and the kernel xmits
+    /* In generic mode packet transmission is synchronous, and the kernel xmits
      * only TX_BATCH_SIZE(16) packets for a single sendmsg syscall.
      * So, we have to kick the kernel (n_packets / 16) times to be sure that
      * all packets are transmitted. */
-    retries = (xdpmode == XDP_COPY)
+    retries = (mode == OVS_AF_XDP_MODE_GENERIC)
               ? xsk_info->outstanding_tx / KERNEL_TX_BATCH_SIZE
               : 0;
 kick_retry:
@@ -962,7 +1020,7 @@  __netdev_afxdp_batch_send(struct netdev *netdev, int qid,
                            &orig);
         COVERAGE_INC(afxdp_tx_full);
         afxdp_complete_tx(xsk_info);
-        kick_tx(xsk_info, dev->xdpmode, dev->use_need_wakeup);
+        kick_tx(xsk_info, dev->xdp_mode_in_use, dev->use_need_wakeup);
         error = ENOMEM;
         goto out;
     }
@@ -986,7 +1044,7 @@  __netdev_afxdp_batch_send(struct netdev *netdev, int qid,
     xsk_ring_prod__submit(&xsk_info->tx, dp_packet_batch_size(batch));
     xsk_info->outstanding_tx += dp_packet_batch_size(batch);
 
-    ret = kick_tx(xsk_info, dev->xdpmode, dev->use_need_wakeup);
+    ret = kick_tx(xsk_info, dev->xdp_mode_in_use, dev->use_need_wakeup);
     if (OVS_UNLIKELY(ret)) {
         VLOG_WARN_RL(&rl, "%s: error sending AF_XDP packet: %s.",
                      netdev_get_name(netdev), ovs_strerror(ret));
@@ -1052,10 +1110,11 @@  netdev_afxdp_construct(struct netdev *netdev)
     /* Queues should not be used before the first reconfiguration. Clearing. */
     netdev->n_rxq = 0;
     netdev->n_txq = 0;
-    dev->xdpmode = 0;
+    dev->xdp_mode = OVS_AF_XDP_MODE_UNSPEC;
+    dev->xdp_mode_in_use = OVS_AF_XDP_MODE_UNSPEC;
 
     dev->requested_n_rxq = NR_QUEUE;
-    dev->requested_xdpmode = XDP_COPY;
+    dev->requested_xdp_mode = OVS_AF_XDP_MODE_BEST_EFFORT;
     dev->requested_need_wakeup = NEED_WAKEUP_DEFAULT;
 
     dev->xsks = NULL;
diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h
index e2f400b72..4fe861d2d 100644
--- a/lib/netdev-afxdp.h
+++ b/lib/netdev-afxdp.h
@@ -25,6 +25,15 @@ 
 /* These functions are Linux AF_XDP specific, so they should be used directly
  * only by Linux-specific code. */
 
+enum afxdp_mode {
+    OVS_AF_XDP_MODE_UNSPEC,
+    OVS_AF_XDP_MODE_BEST_EFFORT,
+    OVS_AF_XDP_MODE_NATIVE_ZC,
+    OVS_AF_XDP_MODE_NATIVE,
+    OVS_AF_XDP_MODE_GENERIC,
+    OVS_AF_XDP_MODE_MAX,
+};
+
 struct netdev;
 struct xsk_socket_info;
 struct xdp_umem;
diff --git a/lib/netdev-linux-private.h b/lib/netdev-linux-private.h
index c14f2fb81..8873caa9d 100644
--- a/lib/netdev-linux-private.h
+++ b/lib/netdev-linux-private.h
@@ -100,10 +100,14 @@  struct netdev_linux {
     /* AF_XDP information. */
     struct xsk_socket_info **xsks;
     int requested_n_rxq;
-    int xdpmode;                /* AF_XDP running mode: driver or skb. */
-    int requested_xdpmode;
+
+    enum afxdp_mode xdp_mode;               /* Configured AF_XDP mode. */
+    enum afxdp_mode requested_xdp_mode;     /* Requested  AF_XDP mode. */
+    enum afxdp_mode xdp_mode_in_use;        /* Effective  AF_XDP mode. */
+
     bool use_need_wakeup;
     bool requested_need_wakeup;
+
     struct ovs_spin *tx_locks;  /* spin lock array for TX queues. */
 #endif
 };
diff --git a/tests/system-afxdp-macros.at b/tests/system-afxdp-macros.at
index f0683c0a9..5ee2ceb1a 100644
--- a/tests/system-afxdp-macros.at
+++ b/tests/system-afxdp-macros.at
@@ -30,10 +30,3 @@  m4_define([CONFIGURE_VETH_OFFLOADS],
      AT_CHECK([ethtool -K $1 txvlan off], [0], [ignore], [ignore])
     ]
 )
-
-# OVS_START_L7([namespace], [protocol])
-#
-# AF_XDP doesn't work with TCP over virtual interfaces for now.
-#
-m4_define([OVS_START_L7],
-   [AT_SKIP_IF([:])])
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index efdfb83bb..02a68deb1 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -3107,18 +3107,38 @@  ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
         </p>
       </column>
 
-      <column name="options" key="xdpmode"
+      <column name="options" key="xdp-mode"
               type='{"type": "string",
-                     "enum": ["set", ["skb", "drv"]]}'>
+                     "enum": ["set", ["best-effort", "native-with-zerocopy",
+                                      "native", "generic"]]}'>
         <p>
           Specifies the operational mode of the XDP program.
-          If "drv", the XDP program is loaded into the device driver with
-          zero-copy RX and TX enabled. This mode requires device driver with
-          AF_XDP support and has the best performance.
-          If "skb", the XDP program is using generic XDP mode in kernel
with
-          extra data copying between userspace and kernel. No device driver
-          support is needed. Note that this is afxdp netdev type only.
-          Defaults to "skb" mode.
+          <p>
+            In <code>native-with-zerocopy</code> mode the XDP program is loaded
+            into the device driver with zero-copy RX and TX enabled.  This mode
+            requires device driver support and has the best performance because
+            there should be no copying of packets.
+          </p>
+          <p>
+            <code>native</code> is the same as
+            <code>native-with-zerocopy</code>, but without zero-copy
+            capability.  This requires at least one copy between kernel and the
+            userspace. This mode also requires support from device driver.
+          </p>
+          <p>
+            In <code>generic</code> case the XDP program in kernel works after
+            skb allocation on early stages of packet processing inside the
+            network stack.  This mode doesn't require driver support, but has
+            much lower performance.
+          </p>
+          <p>