From: William Tu
To: dev@openvswitch.org, trozet@redhat.com, bmcfall@redhat.com, echaudro@redhat.com, magnus.karlsson@gmail.com, bjorn.topel@gmail.com, tuc@vmware.com
Date: Mon, 1 Apr 2019 15:46:48 -0700
Message-Id: <1554158812-44622-1-git-send-email-u9012063@gmail.com>
Subject: [ovs-dev] [PATCH RFCv4 0/4] AF_XDP netdev support for OVS

This patch series introduces AF_XDP support for OVS netdev. AF_XDP is a
new address family that works together with eBPF. In short, a socket
with the AF_XDP family can receive and send packets from an eBPF/XDP
program attached to the netdev. For more details about AF_XDP, please
see the Linux kernel's Documentation/networking/af_xdp.rst.

OVS has several netdev types, e.g., system, tap, and internal. The
series first adds a new netdev type called "afxdp" and implements its
configuration, packet reception, and transmit functions. Since the
AF_XDP socket, xsk, operates in userspace, once ovs-vswitchd receives
packets from the xsk, the proposed architecture re-uses the existing
userspace dpif-netdev datapath. As a result, most of the packet
processing happens in userspace instead of in the Linux kernel.

Architecture
============
               _
              |   +-------------------+
              |   |    ovs-vswitchd   |<-->ovsdb-server
              |   +-------------------+
              |   |      ofproto      |<-->OpenFlow controllers
              |   +--------+-+--------+
              |   | netdev | |ofproto-|
    userspace |   +--------+ |  dpif  |
              |   | netdev | +--------+
              |   |provider| |  dpif  |
              |   +---||---+ +--------+
              |       ||     |  dpif- |
              |       ||     | netdev |
              |_      ||     +--------+
                      ||
               _  +---||-----+--------+
              |   | af_xdp prog +     |
       kernel |   |   xsk_map         |
              |_  +--------||---------+
                           ||
                        physical
                           NIC

To get started, create an OVS userspace bridge using dpif-netdev by
setting the datapath_type to netdev:
  # ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev

And attach a Linux netdev with type afxdp:
  # ovs-vsctl add-port br0 afxdp-p0 -- \
      set interface afxdp-p0 type="afxdp"

Performance
===========
For this version, v4, I mainly focus on getting the features right with
the libbpf AF_XDP API, using the AF_XDP SKB mode, which is the slower
set-up. The next version will measure performance and add
optimizations.

Documentation
=============
Most of the design details are described in the paper presented at
Linux Plumbers Conference 2018, "Bringing the Power of eBPF to Open
vSwitch"[1], section 4, and slides[2].
Earlier versions of this series used a not-yet-upstreamed feature
called XDP_ATTACH[3], described in section 3.1, which is a built-in XDP
program for AF_XDP that greatly simplifies the management of XDP/eBPF
programs. As the v3->v4 changelog notes, this version drops that
dependency and uses the AF_XDP API provided by libbpf.

[1] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-afxdp.pdf
[2] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-lpc18-presentation.pdf
[3] http://vger.kernel.org/lpc_net2018_talks/lpc18_paper_af_xdp_perf-v2.pdf

For the installation and configuration guide, see
Documentation/intro/install/bpf.rst

Test Cases
==========
Test cases are created using namespaces and veth peers, with an AF_XDP
socket attached to the veth (hence SKB mode).
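
As a rough sketch of the kind of namespace-and-veth topology described
above (this is not the actual macro code in
tests/system-afxdp-macros.at), the helper below prints the commands that
would create one namespace, one veth pair, and an afxdp port on the
bridge. The names at_ns0, p0, ovs-p0 and the address 10.1.1.1/24 are
assumptions chosen for illustration; br0 is the bridge created earlier.
By default nothing is executed; setting RUN=1 as root would run the
commands.

```shell
#!/bin/sh
# Sketch only: namespace, veth, and address values are illustrative
# assumptions, not taken from the patch series.

# Print the command (default) or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Usage: afxdp_veth_setup NAMESPACE PORT BRIDGE ADDRESS
# Creates NAMESPACE, a veth pair PORT <-> ovs-PORT, moves PORT into the
# namespace, and attaches ovs-PORT to BRIDGE as an afxdp port.
afxdp_veth_setup() {
    ns=$1
    port=$2
    br=$3
    addr=$4

    run ip netns add "$ns"
    run ip link add "$port" type veth peer name "ovs-$port"
    run ip link set "$port" netns "$ns"
    run ip link set dev "ovs-$port" up
    run ovs-vsctl add-port "$br" "ovs-$port" -- set interface "ovs-$port" type=afxdp
    run ip netns exec "$ns" ip addr add "$addr" dev "$port"
    run ip netns exec "$ns" ip link set dev "$port" up
}

afxdp_veth_setup at_ns0 p0 br0 10.1.1.1/24
```

Calling the helper a second time with another namespace and port name
would give the two endpoints needed for a ping test between two ports.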

Running "make check-afxdp" gives the following results:

AF_XDP netdev

datapath-sanity

  1: datapath - ping between two ports                             ok
  2: datapath - ping between two ports on vlan                     ok
  3: datapath - ping6 between two ports                            ok
  4: datapath - ping6 between two ports on vlan                    ok
  5: datapath - ping over vxlan tunnel                             ok
  6: datapath - ping over vxlan6 tunnel                            ok
  7: datapath - ping over gre tunnel                               ok
  8: datapath - ping over erspan v1 tunnel                         ok
  9: datapath - ping over erspan v2 tunnel                         ok
 10: datapath - ping over ip6erspan v1 tunnel                      ok
 11: datapath - ping over ip6erspan v2 tunnel                      ok
 12: datapath - ping over geneve tunnel                            ok
 13: datapath - ping over geneve6 tunnel                           ok
 14: datapath - clone action                                       ok
 15: datapath - basic truncate action                              ok

conntrack

 16: conntrack - controller                                        ok
 17: conntrack - force commit                                      ok
 18: conntrack - ct flush by 5-tuple                               ok
 19: conntrack - IPv4 ping                                         ok
 20: conntrack - get_nconns and get/set_maxconns                   ok
 21: conntrack - IPv6 ping                                         ok

system-ovn

 22: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT  ok
 23: ovn -- 2 LRs connected via LS, gateway router, easy SNAT      ok
 24: ovn -- multiple gateway routers, SNAT and DNAT                ok
 25: ovn -- load-balancing                                         ok
 26: ovn -- load-balancing - same subnet.                          ok
 27: ovn -- load balancing in gateway router                       ok
 28: ovn -- multiple gateway routers, load-balancing               ok
 29: ovn -- load balancing in router with gateway router port      ok
 30: ovn -- DNAT and SNAT on distributed router - N/S              ok
 31: ovn -- DNAT and SNAT on distributed router - E/W              ok

---
v1->v2:
- add a list to maintain unused umem elements
- remove copy from rx umem to ovs internal buffer
- use hugetlb to reduce misses (not much difference)
- use pmd mode netdev in OVS (huge performance improvement)
- remove malloc of dp_packet; instead put dp_packet in umem

v2->v3:
- rebase on the OVS master, 7ab4b0653784 ("configure: Check for more
  specific function to pull in pthread library.")
- remove the dependency on libbpf and dpif-bpf; instead, use the
  built-in XDP_ATTACH feature.
- data structure optimizations for better performance, see[1]
- more test cases support
v3: https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/354179.html

v3->v4:
- Use AF_XDP API provided by libbpf
- Remove the dependency on XDP_ATTACH kernel patch set
- Add documentation, bpf.rst

William Tu (4):
  Add libbpf build support.
  netdev-afxdp: add new netdev type for AF_XDP
  tests: add AF_XDP netdev test cases.
  afxdp netdev: add documentation and configuration.

 Documentation/automake.mk             |   1 +
 Documentation/index.rst               |   1 +
 Documentation/intro/install/bpf.rst   | 182 +++++++
 Documentation/intro/install/index.rst |   1 +
 acinclude.m4                          |  20 +
 configure.ac                          |   1 +
 lib/automake.mk                       |   7 +-
 lib/dp-packet.c                       |  12 +
 lib/dp-packet.h                       |  32 +-
 lib/dpif-netdev.c                     |   2 +-
 lib/netdev-afxdp.c                    | 491 +++++++++++++++++
 lib/netdev-afxdp.h                    |  39 ++
 lib/netdev-linux.c                    |  78 ++-
 lib/netdev-provider.h                 |   1 +
 lib/netdev.c                          |   1 +
 lib/xdpsock.c                         | 179 +++++++
 lib/xdpsock.h                         | 129 +++++
 tests/automake.mk                     |  17 +
 tests/system-afxdp-macros.at          | 153 ++++++
 tests/system-afxdp-testsuite.at       |  26 +
 tests/system-afxdp-traffic.at         | 978 ++++++++++++++++++++++++++++++++++
 21 files changed, 2345 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/intro/install/bpf.rst
 create mode 100644 lib/netdev-afxdp.c
 create mode 100644 lib/netdev-afxdp.h
 create mode 100644 lib/xdpsock.c
 create mode 100644 lib/xdpsock.h
 create mode 100644 tests/system-afxdp-macros.at
 create mode 100644 tests/system-afxdp-testsuite.at
 create mode 100644 tests/system-afxdp-traffic.at