diff --git a/Documentation/intro/install/afxdp.rst b/Documentation/intro/install/afxdp.rst
index a136db0c9..937770ad0 100644
--- a/Documentation/intro/install/afxdp.rst
+++ b/Documentation/intro/install/afxdp.rst
@@ -153,9 +153,8 @@ To kick start end-to-end autotesting::
make check-afxdp TESTSUITEFLAGS='1'
.. note::
- Not all test cases pass at this time. Currenly all TCP related
- tests, ex: using wget or http, are skipped due to XDP limitations
- on veth. cvlan test is also skipped.
+ Not all test cases pass at this time. Currently all cvlan tests are skipped
+ due to kernel issues.
If a test case fails, check the log at::
@@ -177,33 +176,35 @@ in :doc:`general`::
ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
Make sure your device driver support AF_XDP, netdev-afxdp supports
-the following additional options (see man ovs-vswitchd.conf.db for
+the following additional options (see ``man ovs-vswitchd.conf.db`` for
more details):
- * **xdpmode**: use "drv" for driver mode, or "skb" for skb mode.
+ * ``xdp-mode``: ``best-effort``, ``native-with-zerocopy``,
+ ``native`` or ``generic``. Defaults to ``best-effort``, i.e. the best of
+ the supported modes, so in most cases you don't need to change it.
- * **use-need-wakeup**: default "true" if libbpf supports it, otherwise false.
+ * ``use-need-wakeup``: default ``true`` if libbpf supports it,
+ otherwise ``false``.
For example, to use 1 PMD (on core 4) on 1 queue (queue 0) device,
-configure these options: **pmd-cpu-mask, pmd-rxq-affinity, and n_rxq**.
-The **xdpmode** can be "drv" or "skb"::
+configure these options: ``pmd-cpu-mask``, ``pmd-rxq-affinity``, and
+``n_rxq``::
ethtool -L enp2s0 combined 1
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x10
ovs-vsctl add-port br0 enp2s0 -- set interface enp2s0 type="afxdp" \
- options:n_rxq=1 options:xdpmode=drv \
- other_config:pmd-rxq-affinity="0:4"
+ other_config:pmd-rxq-affinity="0:4"
Or, use 4 pmds/cores and 4 queues by doing::
ethtool -L enp2s0 combined 4
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x36
ovs-vsctl add-port br0 enp2s0 -- set interface enp2s0 type="afxdp" \
- options:n_rxq=4 options:xdpmode=drv \
- other_config:pmd-rxq-affinity="0:1,1:2,2:3,3:4"
+ options:n_rxq=4 other_config:pmd-rxq-affinity="0:1,1:2,2:3,3:4"
.. note::
- pmd-rxq-affinity is optional. If not specified, system will auto-assign.
+ ``pmd-rxq-affinity`` is optional. If not specified, the system assigns
+ queues to PMD threads automatically. ``n_rxq`` defaults to ``1``.
To validate that the bridge has successfully instantiated, you can use the::
@@ -214,12 +215,21 @@ Should show something like::
Port "ens802f0"
Interface "ens802f0"
type: afxdp
- options: {n_rxq="1", xdpmode=drv}
+ options: {n_rxq="1"}
Otherwise, enable debugging by::
ovs-appctl vlog/set netdev_afxdp::dbg
+To check which XDP mode was chosen by ``best-effort``, you can look for
+``xdp-mode-in-use`` in the output of ``ovs-appctl dpctl/show``::
+
+ # ovs-appctl dpctl/show
+ netdev@ovs-netdev:
+ <...>
+ port 2: ens802f0 (afxdp: n_rxq=1, use-need-wakeup=true,
+ xdp-mode=best-effort,
+ xdp-mode-in-use=native-with-zerocopy)
References
----------
@@ -323,8 +333,11 @@ Limitations/Known Issues
#. Most of the tests are done using i40e single port. Multiple ports and
also ixgbe driver also needs to be tested.
#. No latency test result (TODO items)
-#. Due to limitations of current upstream kernel, TCP and various offloading
+#. Due to limitations of current upstream kernel, various offloading
(vlan, cvlan) is not working over virtual interfaces (i.e. veth pair).
+ Also, TCP is not working over virtual interfaces in generic XDP mode.
+ More information and a possible workaround are available `here
+ `__ .
PVP using tap device
@@ -335,8 +348,7 @@ First, start OVS, then add physical port::
ethtool -L enp2s0 combined 1
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x10
ovs-vsctl add-port br0 enp2s0 -- set interface enp2s0 type="afxdp" \
- options:n_rxq=1 options:xdpmode=drv \
- other_config:pmd-rxq-affinity="0:4"
+ options:n_rxq=1 other_config:pmd-rxq-affinity="0:4"
Start a VM with virtio and tap device::
@@ -414,13 +426,11 @@ Create namespace and veth peer devices::
Attach the veth port to br0 (linux kernel mode)::
- ovs-vsctl add-port br0 afxdp-p0 -- \
- set interface afxdp-p0 options:n_rxq=1
+ ovs-vsctl add-port br0 afxdp-p0 -- set interface afxdp-p0
-Or, use AF_XDP with skb mode::
+Or, use AF_XDP::
- ovs-vsctl add-port br0 afxdp-p0 -- \
- set interface afxdp-p0 type="afxdp" options:n_rxq=1 options:xdpmode=skb
+ ovs-vsctl add-port br0 afxdp-p0 -- set interface afxdp-p0 type="afxdp"
Setup the OpenFlow rules::
diff --git a/NEWS b/NEWS
index 0d65d5a7f..100d7b6a8 100644
--- a/NEWS
+++ b/NEWS
@@ -5,11 +5,19 @@ Post-v2.12.0
separate project. You can find it at
https://github.com/ovn-org/ovn.git
- Userspace datapath:
+ * Add option to enable, disable and query TCP sequence checking in
+ conntrack.
+ - AF_XDP:
* New option 'use-need-wakeup' for netdev-afxdp to control enabling
of corresponding 'need_wakeup' flag in AF_XDP rings. Enabled by default
if supported by libbpf.
- * Add option to enable, disable and query TCP sequence checking in
- conntrack.
+ * 'xdpmode' option for netdev-afxdp renamed to 'xdp-mode'.
+ Modes also updated. New values:
+ native-with-zerocopy - former DRV
+ native - new one, DRV without zero-copy
+ generic - former SKB
+ best-effort [default] - new one, chooses the best available from
+ 3 above modes
- DPDK:
* DPDK pdump packet capture support disabled by default. New configure
option '--enable-dpdk-pdump' to enable it.
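
For setups that still carry the removed 'xdpmode' key, the renaming listed in the NEWS entry above can be sketched as a small C helper. This is only an illustration: the function name `legacy_xdpmode_to_xdp_mode` is hypothetical and not part of OVS. It encodes the two documented equivalences (drv to native-with-zerocopy, skb to generic) and matches case-insensitively, as the old parser did.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp() (POSIX). */

/* Hypothetical helper, not part of OVS: translates a value of the removed
 * 'xdpmode' option into the equivalent 'xdp-mode' value, following the
 * mapping in the NEWS entry.  Returns NULL for an unrecognized value. */
static const char *
legacy_xdpmode_to_xdp_mode(const char *legacy)
{
    if (!legacy) {
        return NULL;
    }
    if (!strcasecmp(legacy, "drv")) {
        return "native-with-zerocopy";   /* Former DRV mode. */
    }
    if (!strcasecmp(legacy, "skb")) {
        return "generic";                /* Former SKB mode. */
    }
    return NULL;
}
```

The new 'native' and 'best-effort' values have no legacy counterpart, so nothing maps onto them.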
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index af654d498..74dde219d 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -89,12 +89,42 @@ BUILD_ASSERT_DECL(PROD_NUM_DESCS == CONS_NUM_DESCS);
#define UMEM2DESC(elem, base) ((uint64_t)((char *)elem - (char *)base))
static struct xsk_socket_info *xsk_configure(int ifindex, int xdp_queue_id,
- int mode, bool use_need_wakeup);
-static void xsk_remove_xdp_program(uint32_t ifindex, int xdpmode);
+ enum afxdp_mode mode,
+ bool use_need_wakeup,
+ bool report_socket_failures);
+static void xsk_remove_xdp_program(uint32_t ifindex, enum afxdp_mode);
static void xsk_destroy(struct xsk_socket_info *xsk);
static int xsk_configure_all(struct netdev *netdev);
static void xsk_destroy_all(struct netdev *netdev);
+static struct {
+ const char *name;
+ uint32_t bind_flags;
+ uint32_t xdp_flags;
+} xdp_modes[] = {
+ [OVS_AF_XDP_MODE_UNSPEC] = {
+ .name = "unspecified", .bind_flags = 0, .xdp_flags = 0,
+ },
+ [OVS_AF_XDP_MODE_BEST_EFFORT] = {
+ .name = "best-effort", .bind_flags = 0, .xdp_flags = 0,
+ },
+ [OVS_AF_XDP_MODE_NATIVE_ZC] = {
+ .name = "native-with-zerocopy",
+ .bind_flags = XDP_ZEROCOPY,
+ .xdp_flags = XDP_FLAGS_DRV_MODE,
+ },
+ [OVS_AF_XDP_MODE_NATIVE] = {
+ .name = "native",
+ .bind_flags = XDP_COPY,
+ .xdp_flags = XDP_FLAGS_DRV_MODE,
+ },
+ [OVS_AF_XDP_MODE_GENERIC] = {
+ .name = "generic",
+ .bind_flags = XDP_COPY,
+ .xdp_flags = XDP_FLAGS_SKB_MODE,
+ },
+};
+
struct unused_pool {
struct xsk_umem_info *umem_info;
int lost_in_rings; /* Number of packets left in tx, rx, cq and fq. */
@@ -214,7 +244,7 @@ netdev_afxdp_sweep_unused_pools(void *aux OVS_UNUSED)
}
static struct xsk_umem_info *
-xsk_configure_umem(void *buffer, uint64_t size, int xdpmode)
+xsk_configure_umem(void *buffer, uint64_t size)
{
struct xsk_umem_config uconfig;
struct xsk_umem_info *umem;
@@ -232,9 +262,7 @@ xsk_configure_umem(void *buffer, uint64_t size, int xdpmode)
ret = xsk_umem__create(&umem->umem, buffer, size, &umem->fq, &umem->cq,
&uconfig);
if (ret) {
- VLOG_ERR("xsk_umem__create failed (%s) mode: %s",
- ovs_strerror(errno),
- xdpmode == XDP_COPY ? "SKB": "DRV");
+ VLOG_ERR("xsk_umem__create failed: %s.", ovs_strerror(errno));
free(umem);
return NULL;
}
@@ -290,7 +318,8 @@ xsk_configure_umem(void *buffer, uint64_t size, int xdpmode)
static struct xsk_socket_info *
xsk_configure_socket(struct xsk_umem_info *umem, uint32_t ifindex,
- uint32_t queue_id, int xdpmode, bool use_need_wakeup)
+ uint32_t queue_id, enum afxdp_mode mode,
+ bool use_need_wakeup, bool report_socket_failures)
{
struct xsk_socket_config cfg;
struct xsk_socket_info *xsk;
@@ -304,14 +333,8 @@ xsk_configure_socket(struct xsk_umem_info *umem, uint32_t ifindex,
cfg.rx_size = CONS_NUM_DESCS;
cfg.tx_size = PROD_NUM_DESCS;
cfg.libbpf_flags = 0;
-
- if (xdpmode == XDP_ZEROCOPY) {
- cfg.bind_flags = XDP_ZEROCOPY;
- cfg.xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_DRV_MODE;
- } else {
- cfg.bind_flags = XDP_COPY;
- cfg.xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_SKB_MODE;
- }
+ cfg.bind_flags = xdp_modes[mode].bind_flags;
+ cfg.xdp_flags = xdp_modes[mode].xdp_flags | XDP_FLAGS_UPDATE_IF_NOEXIST;
#ifdef HAVE_XDP_NEED_WAKEUP
if (use_need_wakeup) {
@@ -329,12 +352,11 @@ xsk_configure_socket(struct xsk_umem_info *umem, uint32_t ifindex,
ret = xsk_socket__create(&xsk->xsk, devname, queue_id, umem->umem,
&xsk->rx, &xsk->tx, &cfg);
if (ret) {
- VLOG_ERR("xsk_socket__create failed (%s) mode: %s "
- "use-need-wakeup: %s qid: %d",
- ovs_strerror(errno),
- xdpmode == XDP_COPY ? "SKB": "DRV",
- use_need_wakeup ? "true" : "false",
- queue_id);
+ VLOG(report_socket_failures ? VLL_ERR : VLL_DBG,
+ "xsk_socket__create failed (%s) mode: %s, "
+ "use-need-wakeup: %s, qid: %d",
+ ovs_strerror(errno), xdp_modes[mode].name,
+ use_need_wakeup ? "true" : "false", queue_id);
free(xsk);
return NULL;
}
@@ -375,8 +397,8 @@ xsk_configure_socket(struct xsk_umem_info *umem, uint32_t ifindex,
}
static struct xsk_socket_info *
-xsk_configure(int ifindex, int xdp_queue_id, int xdpmode,
- bool use_need_wakeup)
+xsk_configure(int ifindex, int xdp_queue_id, enum afxdp_mode mode,
+ bool use_need_wakeup, bool report_socket_failures)
{
struct xsk_socket_info *xsk;
struct xsk_umem_info *umem;
@@ -389,9 +411,7 @@ xsk_configure(int ifindex, int xdp_queue_id, int xdpmode,
memset(bufs, 0, NUM_FRAMES * FRAME_SIZE);
/* Create AF_XDP socket. */
- umem = xsk_configure_umem(bufs,
- NUM_FRAMES * FRAME_SIZE,
- xdpmode);
+ umem = xsk_configure_umem(bufs, NUM_FRAMES * FRAME_SIZE);
if (!umem) {
free_pagealign(bufs);
return NULL;
@@ -399,8 +419,8 @@ xsk_configure(int ifindex, int xdp_queue_id, int xdpmode,
VLOG_DBG("Allocated umem pool at 0x%"PRIxPTR, (uintptr_t) umem);
- xsk = xsk_configure_socket(umem, ifindex, xdp_queue_id, xdpmode,
- use_need_wakeup);
+ xsk = xsk_configure_socket(umem, ifindex, xdp_queue_id, mode,
+ use_need_wakeup, report_socket_failures);
if (!xsk) {
/* Clean up umem and xpacket pool. */
if (xsk_umem__delete(umem->umem)) {
@@ -414,12 +434,38 @@ xsk_configure(int ifindex, int xdp_queue_id, int xdpmode,
return xsk;
}
+static int
+xsk_configure_queue(struct netdev_linux *dev, int ifindex, int queue_id,
+ enum afxdp_mode mode, bool report_socket_failures)
+{
+ struct xsk_socket_info *xsk_info;
+
+ VLOG_DBG("%s: configuring queue: %d, mode: %s, use-need-wakeup: %s.",
+ netdev_get_name(&dev->up), queue_id, xdp_modes[mode].name,
+ dev->use_need_wakeup ? "true" : "false");
+ xsk_info = xsk_configure(ifindex, queue_id, mode, dev->use_need_wakeup,
+ report_socket_failures);
+ if (!xsk_info) {
+ VLOG(report_socket_failures ? VLL_ERR : VLL_DBG,
+ "%s: Failed to create AF_XDP socket on queue %d in %s mode.",
+ netdev_get_name(&dev->up), queue_id, xdp_modes[mode].name);
+ dev->xsks[queue_id] = NULL;
+ return -1;
+ }
+ dev->xsks[queue_id] = xsk_info;
+ atomic_init(&xsk_info->tx_dropped, 0);
+ xsk_info->outstanding_tx = 0;
+ xsk_info->available_rx = PROD_NUM_DESCS;
+ return 0;
+}
+
+
static int
xsk_configure_all(struct netdev *netdev)
{
struct netdev_linux *dev = netdev_linux_cast(netdev);
- struct xsk_socket_info *xsk_info;
int i, ifindex, n_rxq, n_txq;
+ int qid = 0;
ifindex = linux_get_ifindex(netdev_get_name(netdev));
@@ -429,23 +475,36 @@ xsk_configure_all(struct netdev *netdev)
n_rxq = netdev_n_rxq(netdev);
dev->xsks = xcalloc(n_rxq, sizeof *dev->xsks);
- /* Configure each queue. */
- for (i = 0; i < n_rxq; i++) {
- VLOG_DBG("%s: configure queue %d mode %s use-need-wakeup %s.",
- netdev_get_name(netdev), i,
- dev->xdpmode == XDP_COPY ? "SKB" : "DRV",
- dev->use_need_wakeup ? "true" : "false");
- xsk_info = xsk_configure(ifindex, i, dev->xdpmode,
- dev->use_need_wakeup);
- if (!xsk_info) {
- VLOG_ERR("Failed to create AF_XDP socket on queue %d.", i);
- dev->xsks[i] = NULL;
+ if (dev->xdp_mode == OVS_AF_XDP_MODE_BEST_EFFORT) {
+ /* Try to configure the first queue in each mode, fastest first, to
+ * find the most suitable one. */
+ for (i = OVS_AF_XDP_MODE_NATIVE_ZC; i < OVS_AF_XDP_MODE_MAX; i++) {
+ if (!xsk_configure_queue(dev, ifindex, qid, i,
+ i == OVS_AF_XDP_MODE_MAX - 1)) {
+ dev->xdp_mode_in_use = i;
+ VLOG_INFO("%s: %s XDP mode will be in use.",
+ netdev_get_name(netdev), xdp_modes[i].name);
+ break;
+ }
+ }
+ if (i == OVS_AF_XDP_MODE_MAX) {
+ VLOG_ERR("%s: Failed to detect suitable XDP mode.",
+ netdev_get_name(netdev));
+ goto err;
+ }
+ qid++;
+ } else {
+ dev->xdp_mode_in_use = dev->xdp_mode;
+ }
+
+ /* Configure remaining queues. */
+ for (; qid < n_rxq; qid++) {
+ if (xsk_configure_queue(dev, ifindex, qid,
+ dev->xdp_mode_in_use, true)) {
+ VLOG_ERR("%s: Failed to create AF_XDP socket on queue %d.",
+ netdev_get_name(netdev), qid);
goto err;
}
- dev->xsks[i] = xsk_info;
- atomic_init(&xsk_info->tx_dropped, 0);
- xsk_info->outstanding_tx = 0;
- xsk_info->available_rx = PROD_NUM_DESCS;
}
n_txq = netdev_n_txq(netdev);
@@ -500,7 +559,7 @@ xsk_destroy_all(struct netdev *netdev)
if (dev->xsks[i]) {
xsk_destroy(dev->xsks[i]);
dev->xsks[i] = NULL;
- VLOG_INFO("Destroyed xsk[%d].", i);
+ VLOG_DBG("%s: Destroyed xsk[%d].", netdev_get_name(netdev), i);
}
}
@@ -510,7 +569,7 @@ xsk_destroy_all(struct netdev *netdev)
VLOG_INFO("%s: Removing xdp program.", netdev_get_name(netdev));
ifindex = linux_get_ifindex(netdev_get_name(netdev));
- xsk_remove_xdp_program(ifindex, dev->xdpmode);
+ xsk_remove_xdp_program(ifindex, dev->xdp_mode_in_use);
if (dev->tx_locks) {
for (i = 0; i < netdev_n_txq(netdev); i++) {
@@ -526,9 +585,10 @@ netdev_afxdp_set_config(struct netdev *netdev, const struct smap *args,
char **errp OVS_UNUSED)
{
struct netdev_linux *dev = netdev_linux_cast(netdev);
- const char *str_xdpmode;
- int xdpmode, new_n_rxq;
+ const char *str_xdp_mode;
+ enum afxdp_mode xdp_mode;
bool need_wakeup;
+ int new_n_rxq;
ovs_mutex_lock(&dev->mutex);
new_n_rxq = MAX(smap_get_int(args, "n_rxq", NR_QUEUE), 1);
@@ -539,14 +599,17 @@ netdev_afxdp_set_config(struct netdev *netdev, const struct smap *args,
return EINVAL;
}
- str_xdpmode = smap_get_def(args, "xdpmode", "skb");
- if (!strcasecmp(str_xdpmode, "drv")) {
- xdpmode = XDP_ZEROCOPY;
- } else if (!strcasecmp(str_xdpmode, "skb")) {
- xdpmode = XDP_COPY;
- } else {
- VLOG_ERR("%s: Incorrect xdpmode (%s).",
- netdev_get_name(netdev), str_xdpmode);
+ str_xdp_mode = smap_get_def(args, "xdp-mode", "best-effort");
+ for (xdp_mode = OVS_AF_XDP_MODE_BEST_EFFORT;
+ xdp_mode < OVS_AF_XDP_MODE_MAX;
+ xdp_mode++) {
+ if (!strcasecmp(str_xdp_mode, xdp_modes[xdp_mode].name)) {
+ break;
+ }
+ }
+ if (xdp_mode == OVS_AF_XDP_MODE_MAX) {
+ VLOG_ERR("%s: Incorrect xdp-mode (%s).",
+ netdev_get_name(netdev), str_xdp_mode);
ovs_mutex_unlock(&dev->mutex);
return EINVAL;
}
@@ -560,10 +623,10 @@ netdev_afxdp_set_config(struct netdev *netdev, const struct smap *args,
#endif
if (dev->requested_n_rxq != new_n_rxq
- || dev->requested_xdpmode != xdpmode
+ || dev->requested_xdp_mode != xdp_mode
|| dev->requested_need_wakeup != need_wakeup) {
dev->requested_n_rxq = new_n_rxq;
- dev->requested_xdpmode = xdpmode;
+ dev->requested_xdp_mode = xdp_mode;
dev->requested_need_wakeup = need_wakeup;
netdev_request_reconfigure(netdev);
}
@@ -578,8 +641,9 @@ netdev_afxdp_get_config(const struct netdev *netdev, struct smap *args)
ovs_mutex_lock(&dev->mutex);
smap_add_format(args, "n_rxq", "%d", netdev->n_rxq);
- smap_add_format(args, "xdpmode", "%s",
- dev->xdpmode == XDP_ZEROCOPY ? "drv" : "skb");
+ smap_add_format(args, "xdp-mode", "%s", xdp_modes[dev->xdp_mode].name);
+ smap_add_format(args, "xdp-mode-in-use", "%s",
+ xdp_modes[dev->xdp_mode_in_use].name);
smap_add_format(args, "use-need-wakeup", "%s",
dev->use_need_wakeup ? "true" : "false");
ovs_mutex_unlock(&dev->mutex);
@@ -596,7 +660,7 @@ netdev_afxdp_reconfigure(struct netdev *netdev)
ovs_mutex_lock(&dev->mutex);
if (netdev->n_rxq == dev->requested_n_rxq
- && dev->xdpmode == dev->requested_xdpmode
+ && dev->xdp_mode == dev->requested_xdp_mode
&& dev->use_need_wakeup == dev->requested_need_wakeup
&& dev->xsks) {
goto out;
@@ -607,9 +671,9 @@ netdev_afxdp_reconfigure(struct netdev *netdev)
netdev->n_rxq = dev->requested_n_rxq;
netdev->n_txq = netdev->n_rxq;
- dev->xdpmode = dev->requested_xdpmode;
+ dev->xdp_mode = dev->requested_xdp_mode;
VLOG_INFO("%s: Setting XDP mode to %s.", netdev_get_name(netdev),
- dev->xdpmode == XDP_ZEROCOPY ? "DRV" : "SKB");
+ xdp_modes[dev->xdp_mode].name);
if (setrlimit(RLIMIT_MEMLOCK, &r)) {
VLOG_ERR("setrlimit(RLIMIT_MEMLOCK) failed: %s", ovs_strerror(errno));
@@ -618,7 +682,8 @@ netdev_afxdp_reconfigure(struct netdev *netdev)
err = xsk_configure_all(netdev);
if (err) {
- VLOG_ERR("AF_XDP device %s reconfig failed.", netdev_get_name(netdev));
+ VLOG_ERR("%s: AF_XDP device reconfiguration failed.",
+ netdev_get_name(netdev));
}
netdev_change_seq_changed(netdev);
out:
@@ -638,17 +703,9 @@ netdev_afxdp_get_numa_id(const struct netdev *netdev)
}
static void
-xsk_remove_xdp_program(uint32_t ifindex, int xdpmode)
+xsk_remove_xdp_program(uint32_t ifindex, enum afxdp_mode mode)
{
- uint32_t flags;
-
- flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
-
- if (xdpmode == XDP_COPY) {
- flags |= XDP_FLAGS_SKB_MODE;
- } else if (xdpmode == XDP_ZEROCOPY) {
- flags |= XDP_FLAGS_DRV_MODE;
- }
+ uint32_t flags = xdp_modes[mode].xdp_flags | XDP_FLAGS_UPDATE_IF_NOEXIST;
bpf_set_link_xdp_fd(ifindex, -1, flags);
}
@@ -662,7 +719,7 @@ signal_remove_xdp(struct netdev *netdev)
ifindex = linux_get_ifindex(netdev_get_name(netdev));
VLOG_WARN("Force removing xdp program.");
- xsk_remove_xdp_program(ifindex, dev->xdpmode);
+ xsk_remove_xdp_program(ifindex, dev->xdp_mode_in_use);
}
static struct dp_packet_afxdp *
@@ -782,7 +839,8 @@ netdev_afxdp_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch,
}
static inline int
-kick_tx(struct xsk_socket_info *xsk_info, int xdpmode, bool use_need_wakeup)
+kick_tx(struct xsk_socket_info *xsk_info, enum afxdp_mode mode,
+ bool use_need_wakeup)
{
int ret, retries;
static const int KERNEL_TX_BATCH_SIZE = 16;
@@ -791,11 +849,11 @@ kick_tx(struct xsk_socket_info *xsk_info, int xdpmode, bool use_need_wakeup)
return 0;
}
- /* In SKB_MODE packet transmission is synchronous, and the kernel xmits
+ /* In generic mode packet transmission is synchronous, and the kernel xmits
* only TX_BATCH_SIZE(16) packets for a single sendmsg syscall.
* So, we have to kick the kernel (n_packets / 16) times to be sure that
* all packets are transmitted. */
- retries = (xdpmode == XDP_COPY)
+ retries = (mode == OVS_AF_XDP_MODE_GENERIC)
? xsk_info->outstanding_tx / KERNEL_TX_BATCH_SIZE
: 0;
kick_retry:
@@ -962,7 +1020,7 @@ __netdev_afxdp_batch_send(struct netdev *netdev, int qid,
&orig);
COVERAGE_INC(afxdp_tx_full);
afxdp_complete_tx(xsk_info);
- kick_tx(xsk_info, dev->xdpmode, dev->use_need_wakeup);
+ kick_tx(xsk_info, dev->xdp_mode_in_use, dev->use_need_wakeup);
error = ENOMEM;
goto out;
}
@@ -986,7 +1044,7 @@ __netdev_afxdp_batch_send(struct netdev *netdev, int qid,
xsk_ring_prod__submit(&xsk_info->tx, dp_packet_batch_size(batch));
xsk_info->outstanding_tx += dp_packet_batch_size(batch);
- ret = kick_tx(xsk_info, dev->xdpmode, dev->use_need_wakeup);
+ ret = kick_tx(xsk_info, dev->xdp_mode_in_use, dev->use_need_wakeup);
if (OVS_UNLIKELY(ret)) {
VLOG_WARN_RL(&rl, "%s: error sending AF_XDP packet: %s.",
netdev_get_name(netdev), ovs_strerror(ret));
@@ -1052,10 +1110,11 @@ netdev_afxdp_construct(struct netdev *netdev)
/* Queues should not be used before the first reconfiguration. Clearing. */
netdev->n_rxq = 0;
netdev->n_txq = 0;
- dev->xdpmode = 0;
+ dev->xdp_mode = OVS_AF_XDP_MODE_UNSPEC;
+ dev->xdp_mode_in_use = OVS_AF_XDP_MODE_UNSPEC;
dev->requested_n_rxq = NR_QUEUE;
- dev->requested_xdpmode = XDP_COPY;
+ dev->requested_xdp_mode = OVS_AF_XDP_MODE_BEST_EFFORT;
dev->requested_need_wakeup = NEED_WAKEUP_DEFAULT;
dev->xsks = NULL;
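
The best-effort probing added to xsk_configure_all() can be sketched in isolation as follows. This is a simplified model, not the code from the patch: `try_mode()` is a hypothetical stand-in for xsk_configure_queue(), and the `supported` bitmask simulates what a device accepts. The enum mirrors the order of `enum afxdp_mode`, which the loop relies on to walk from fastest to slowest mode.

```c
#include <stdbool.h>
#include <stdio.h>

/* Same ordering as 'enum afxdp_mode' in the patch; names are shortened to
 * make clear that this is a stand-alone sketch. */
enum mode {
    MODE_UNSPEC,
    MODE_BEST_EFFORT,
    MODE_NATIVE_ZC,
    MODE_NATIVE,
    MODE_GENERIC,
    MODE_MAX,
};

static const char *const mode_names[MODE_MAX] = {
    [MODE_UNSPEC] = "unspecified",
    [MODE_BEST_EFFORT] = "best-effort",
    [MODE_NATIVE_ZC] = "native-with-zerocopy",
    [MODE_NATIVE] = "native",
    [MODE_GENERIC] = "generic",
};

/* Hypothetical stand-in for xsk_configure_queue(): 'supported' is a bitmask
 * of the modes the pretend device accepts.  Returns 0 on success and -1 on
 * failure; the failure is printed only if 'report_failures' is set. */
static int
try_mode(unsigned int supported, enum mode mode, bool report_failures)
{
    if (supported & (1u << mode)) {
        return 0;
    }
    if (report_failures) {
        fprintf(stderr, "failed to create socket in %s mode\n",
                mode_names[mode]);
    }
    return -1;
}

/* Tries the concrete modes from fastest to slowest and returns the first
 * one that works, or MODE_MAX if none does.  Only the last attempt reports
 * its failure, since earlier failures are expected during probing. */
static enum mode
probe_best_effort(unsigned int supported)
{
    int i;

    for (i = MODE_NATIVE_ZC; i < MODE_MAX; i++) {
        if (!try_mode(supported, (enum mode) i, i == MODE_MAX - 1)) {
            break;
        }
    }
    return (enum mode) i;
}
```

Calling `probe_best_effort((1u << MODE_NATIVE) | (1u << MODE_GENERIC))` in this model yields `MODE_NATIVE`; with an empty mask it yields `MODE_MAX`, which corresponds to the "Failed to detect suitable XDP mode" path in the patch.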
diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h
index e2f400b72..4fe861d2d 100644
--- a/lib/netdev-afxdp.h
+++ b/lib/netdev-afxdp.h
@@ -25,6 +25,15 @@
/* These functions are Linux AF_XDP specific, so they should be used directly
* only by Linux-specific code. */
+enum afxdp_mode {
+ OVS_AF_XDP_MODE_UNSPEC,
+ OVS_AF_XDP_MODE_BEST_EFFORT,
+ OVS_AF_XDP_MODE_NATIVE_ZC,
+ OVS_AF_XDP_MODE_NATIVE,
+ OVS_AF_XDP_MODE_GENERIC,
+ OVS_AF_XDP_MODE_MAX,
+};
+
struct netdev;
struct xsk_socket_info;
struct xdp_umem;
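
A note on how this enum is consumed: netdev_afxdp_set_config() resolves the 'xdp-mode' string by scanning a name table indexed by the enum, starting at the best-effort entry so that "unspecified" can never be selected by the user. A minimal stand-alone sketch of that lookup is below; the `CFG_MODE_*` names and `parse_xdp_mode()` are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp() (POSIX). */

/* Same ordering as 'enum afxdp_mode'; renamed for this sketch. */
enum cfg_mode {
    CFG_MODE_UNSPEC,
    CFG_MODE_BEST_EFFORT,
    CFG_MODE_NATIVE_ZC,
    CFG_MODE_NATIVE,
    CFG_MODE_GENERIC,
    CFG_MODE_MAX,
};

static const char *const cfg_mode_names[CFG_MODE_MAX] = {
    [CFG_MODE_UNSPEC] = "unspecified",
    [CFG_MODE_BEST_EFFORT] = "best-effort",
    [CFG_MODE_NATIVE_ZC] = "native-with-zerocopy",
    [CFG_MODE_NATIVE] = "native",
    [CFG_MODE_GENERIC] = "generic",
};

/* Returns the mode whose name matches 'str' case-insensitively, or
 * CFG_MODE_MAX if there is no match.  The scan starts at
 * CFG_MODE_BEST_EFFORT, so "unspecified" is rejected like any other
 * unknown string. */
static enum cfg_mode
parse_xdp_mode(const char *str)
{
    int m;

    for (m = CFG_MODE_BEST_EFFORT; m < CFG_MODE_MAX; m++) {
        if (!strcasecmp(str, cfg_mode_names[m])) {
            break;
        }
    }
    return (enum cfg_mode) m;
}
```

Returning the `_MAX` sentinel for "not found" matches the check in the patch, where `xdp_mode == OVS_AF_XDP_MODE_MAX` leads to the "Incorrect xdp-mode" error and EINVAL.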
diff --git a/lib/netdev-linux-private.h b/lib/netdev-linux-private.h
index c14f2fb81..8873caa9d 100644
--- a/lib/netdev-linux-private.h
+++ b/lib/netdev-linux-private.h
@@ -100,10 +100,14 @@ struct netdev_linux {
/* AF_XDP information. */
struct xsk_socket_info **xsks;
int requested_n_rxq;
- int xdpmode; /* AF_XDP running mode: driver or skb. */
- int requested_xdpmode;
+
+ enum afxdp_mode xdp_mode; /* Configured AF_XDP mode. */
+ enum afxdp_mode requested_xdp_mode; /* Requested AF_XDP mode. */
+ enum afxdp_mode xdp_mode_in_use; /* Effective AF_XDP mode. */
+
bool use_need_wakeup;
bool requested_need_wakeup;
+
struct ovs_spin *tx_locks; /* spin lock array for TX queues. */
#endif
};
diff --git a/tests/system-afxdp-macros.at b/tests/system-afxdp-macros.at
index f0683c0a9..5ee2ceb1a 100644
--- a/tests/system-afxdp-macros.at
+++ b/tests/system-afxdp-macros.at
@@ -30,10 +30,3 @@ m4_define([CONFIGURE_VETH_OFFLOADS],
AT_CHECK([ethtool -K $1 txvlan off], [0], [ignore], [ignore])
]
)
-
-# OVS_START_L7([namespace], [protocol])
-#
-# AF_XDP doesn't work with TCP over virtual interfaces for now.
-#
-m4_define([OVS_START_L7],
- [AT_SKIP_IF([:])])
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index efdfb83bb..02a68deb1 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -3107,18 +3107,38 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
-
+ "enum": ["set", ["best-effort", "native-with-zerocopy",
+ "native", "generic"]]}'>
Specifies the operational mode of the XDP program.
- If "drv", the XDP program is loaded into the device driver with
- zero-copy RX and TX enabled. This mode requires device driver with
- AF_XDP support and has the best performance.
- If "skb", the XDP program is using generic XDP mode in kernel with
- extra data copying between userspace and kernel. No device driver
- support is needed. Note that this is afxdp netdev type only.
- Defaults to "skb" mode.
+
+ In native-with-zerocopy mode the XDP program is loaded
+ into the device driver with zero-copy RX and TX enabled. This mode
+ requires device driver support and has the best performance because
+ there should be no copying of packets.
+
+
+ native is the same as native-with-zerocopy, but without zero-copy
+ capability. This requires at least one copy between kernel and the
+ userspace. This mode also requires support from device driver.
+
+
+ In generic case the XDP program in kernel works after
+ skb allocation on early stages of packet processing inside the
+ network stack. This mode doesn't require driver support, but has
+ much lower performance.
+
+
+ best-effort tries to detect and choose the best
+ (fastest) from the available modes for current interface.
+
+
+ Note that this option is specific to netdev-afxdp.
+ Defaults to best-effort mode.
+