From patchwork Sun Feb 17 17:34:27 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ophir Munk
X-Patchwork-Id: 1043698
From: Ophir Munk
To: ovs-dev@openvswitch.org
Date: Sun, 17 Feb 2019 17:34:27 +0000
Message-Id: <1550424867-24572-1-git-send-email-ophirmu@mellanox.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1550411348-873-1-git-send-email-ophirmu@mellanox.com>
References: <1550411348-873-1-git-send-email-ophirmu@mellanox.com>
Subject: [ovs-dev] [PATCH v3] doc: Add "Representors" topic document

This details how to configure representor ports.

Signed-off-by: Ophir Munk
---
v1:
First version
v2:
Following patch reviews.
https://patchwork.ozlabs.org/patch/1039515/
Mainly add how to create representors with Intel and Mellanox NICs.
v3:
Fix checkpatch complaints about line lengths but still leave warnings on
too-long printouts

 Documentation/topics/dpdk/phy.rst | 139 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 139 insertions(+)

diff --git a/Documentation/topics/dpdk/phy.rst b/Documentation/topics/dpdk/phy.rst
index 1470623..93d74df 100644
--- a/Documentation/topics/dpdk/phy.rst
+++ b/Documentation/topics/dpdk/phy.rst
@@ -219,6 +219,145 @@ For more information please refer to the `DPDK Port Hotplug Framework`__.
 
 __ http://dpdk.org/doc/guides/prog_guide/port_hotplug_framework.html#hotplug
 
+.. _representors:
+
+Representors
+------------
+
+DPDK representors enable configuring a phy port to a guest (VM) machine.
+
+OVS resides in the hypervisor which has one or more physical interfaces also
+known as the physical functions (PFs). If a PF supports SR-IOV it can be used
+to enable communication with the VMs via Virtual Functions (VFs).
+The VFs are virtual PCIe devices created from the physical Ethernet controller.
+
+DPDK models a physical interface as an rte device on top of which an eth
+device is created.
+DPDK (version 18.xx) introduced representor eth devices.
+A representor device represents the VF eth device (VM side) on the hypervisor
+side and operates on top of a PF.
+Representors are multi devices created on top of one PF.
+
+For more information, refer to the `DPDK documentation`__.
+
+__ https://doc.dpdk.org/guides-18.11/prog_guide/switch_representation.html
+
+Prior to port representors there was a one-to-one relationship between the PF
+and the eth device. With port representors the relationship becomes one PF to
+many eth devices.
+With two representor ports, when one of the ports is closed, the PCI
+bus cannot be detached until the second representor port is closed as well.
+
+.. _representors-configuration:
+
+When configuring a PF-based port, OVS traditionally assigns the device PCI
+address in devargs. For an existing bridge called ``br0`` and PCI address
+``0000:08:00.0`` an ``add-port`` command is written as::
+
+    $ ovs-vsctl add-port br0 dpdk-pf -- set Interface dpdk-pf type=dpdk \
+        options:dpdk-devargs=0000:08:00.0
+
+When configuring a VF-based port, DPDK uses an extended devargs syntax which
+has the following format::
+
+    BDBF,representor=[<representor id>]
+
+This syntax shows that a representor is an enumerated eth device (with
+a representor ID) which uses the PF PCI address.
+The following commands add representors 3 and 5 using PCI device address
+``0000:08:00.0``::
+
+    $ ovs-vsctl add-port br0 dpdk-rep3 -- set Interface dpdk-rep3 type=dpdk \
+        options:dpdk-devargs=0000:08:00.0,representor=[3]
+
+    $ ovs-vsctl add-port br0 dpdk-rep5 -- set Interface dpdk-rep5 type=dpdk \
+        options:dpdk-devargs=0000:08:00.0,representor=[5]
+
+.. important::
+
+   Representor ports are configured prior to OVS invocation and independently
+   of it, or by other means as well. Consult the NIC vendor's instructions
+   on how to establish representors.
+
+.. _multi-dev-configuration:
+
+**Intel NICs ixgbe and i40e**
+
+In the following example we create one representor on PF address
+``0000:05:00.0``. Once the NIC is bound to a DPDK-compatible PMD, the
+representor is created::
+
+    # echo 1 > /sys/bus/pci/devices/0000\:05\:00.0/max_vfs
+
+**Mellanox NICs ConnectX-4, ConnectX-5 and ConnectX-6**
+
+In the following example we create two representors on PF address
+``0000:05:00.0`` and net device name ``enp3s0f0``.
+
+- Ensure SR-IOV is enabled on the system.
+
+Enable IOMMU in Linux by adding ``intel_iommu=on`` to kernel parameters, for
+example, using GRUB (see /etc/grub/grub.conf).
+
+- Verify the PF PCI address prior to creating the representors::
+
+    # lspci | grep Mellanox
+    05:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
+    05:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
+
+- Create the two VFs on the compute node::
+
+    # echo 2 > /sys/class/net/enp3s0f0/device/sriov_numvfs
+
+  Verify that the VFs were created::
+
+    # lspci | grep Mellanox
+    05:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
+    05:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
+    05:00.2 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
+    05:00.3 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
+
+- Unbind the relevant VFs 0000:05:00.2..0000:05:00.3::
+
+    # echo 0000:05:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
+    # echo 0000:05:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind
+
+- Change e-switch mode.
+
+The Mellanox NIC has an e-switch on it. Change the e-switch mode from
+legacy to switchdev using the PF PCI address::
+
+    # devlink dev eswitch set pci/0000:05:00.0 mode switchdev
+
+This creates the VF representor network devices in the host OS.
+
+- After setting the PF to switchdev mode, bind the relevant VFs back::
+
+    # echo 0000:05:00.2 > /sys/bus/pci/drivers/mlx5_core/bind
+    # echo 0000:05:00.3 > /sys/bus/pci/drivers/mlx5_core/bind
+
+- Restart Open vSwitch.
+
+To verify that the representors are configured correctly, execute::
+
+    $ ovs-vsctl show
+
+and make sure no errors are indicated.
+
+.. _vendor_configuration:
+
+Port representors are an example of multi devices. There are NICs which support
+multi devices by methods other than representors, for which a generic devargs
+syntax is used.
+The generic syntax is based on the device MAC address::
+
+    class=eth,mac=<MAC address>
+
+For example, the following command adds a port to a bridge called ``br0`` using
+an eth device whose MAC address is ``00:11:22:33:44:55``::
+
+    $ ovs-vsctl add-port br0 dpdk-mac -- set Interface dpdk-mac type=dpdk \
+        options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55"
+
 Jumbo Frames
 ------------