From patchwork Tue Mar 24 23:50:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Maximets X-Patchwork-Id: 1261072 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.137; helo=fraxinus.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=ovn.org Received: from fraxinus.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48n7LJ3qNPz9sPR for ; Wed, 25 Mar 2020 10:51:02 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by fraxinus.osuosl.org (Postfix) with ESMTP id E2CE086472; Tue, 24 Mar 2020 23:50:59 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from fraxinus.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 7u8rcCoN_k_3; Tue, 24 Mar 2020 23:50:59 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by fraxinus.osuosl.org (Postfix) with ESMTP id 0036282033; Tue, 24 Mar 2020 23:50:59 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id C7F50C089F; Tue, 24 Mar 2020 23:50:58 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from whitealder.osuosl.org (smtp1.osuosl.org [140.211.166.138]) by lists.linuxfoundation.org (Postfix) with ESMTP id 169AFC0177 for ; Tue, 24 Mar 2020 23:50:58 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by whitealder.osuosl.org (Postfix) with ESMTP id 0DB3486501 for ; Tue, 24 Mar 2020 23:50:58 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from whitealder.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1DXAbiRgkGcN for ; Tue, 24 Mar 2020 23:50:56 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from relay4-d.mail.gandi.net (relay4-d.mail.gandi.net [217.70.183.196]) by whitealder.osuosl.org (Postfix) with ESMTPS id D839786416 for ; Tue, 24 Mar 2020 23:50:55 +0000 (UTC) X-Originating-IP: 90.177.210.238 Received: from im-t490s.redhat.com (238.210.broadband10.iol.cz [90.177.210.238]) (Authenticated sender: i.maximets@ovn.org) by relay4-d.mail.gandi.net (Postfix) with ESMTPSA id 59402E0004; Tue, 24 Mar 2020 23:50:50 +0000 (UTC) From: Ilya Maximets To: ovs-dev@openvswitch.org Date: Wed, 25 Mar 2020 00:50:45 +0100 Message-Id: <20200324235045.704399-1-i.maximets@ovn.org> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Cc: Ilya Maximets Subject: [ovs-dev] [PATCH] dpif-netdev: Force port reconfiguration to change dynamic_txqs. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" In case number of polling threads goes from exact number of Tx queues in port to higher value while set_tx_multiq() not implemented or not requesting reconfiguration, port will not be reconfigured and datapath will continue using static Tx queue ids leading to crash. Ex.: Assuming that port p0 supports up to 4 Tx queues and doesn't support set_tx_multiq() method. For example, netdev-afxdp could be the case, because it could have multiple Tx queues, but doesn't have set_tx_multiq() implementation because number of Tx queues always equals to number of Rx queues. 1. Configuring pmd-cpu-mask to have 3 pmd threads. 2. Adding port p0 to OVS. At this point wanted_txqs = 4 (3 for pmd threads + 1 for non-pmd). Port reconfigured to have 4 Tx queues successfully. dynamic_txqs = (4 < 4) = false; 3. Configuring pmd-cpu-mask to have 10 pmd threads. At this point wanted_txqs = 11 (10 for pmd threads + 1 for non-pmd). Since set_tx_multiq() is not implemented, netdev doesn't request reconfiguration and 'dynamic_txqs' remains in 'false' state. 4. Since 'dynamic_txqs == false', dpif-netdev uses static Tx queue ids that are in range [0, 10] while device only supports 4 leading to unwanted behavior and crashes. Fix that by marking for reconfiguration all the ports that will likely change their 'dynamic_txqs' value. It looks like the issue could be reproduced only with afxdp ports, because all other non-dpdk ports ignores Tx queue ids and dpdk ports requests for reconfiguration on set_tx_multiq(). Reported-by: William Tu Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2020-March/368364.html Fixes: e32971b8ddb4 ("dpif-netdev: Centralized threads and queues handling code.") Signed-off-by: Ilya Maximets Acked-by: William Tu Acked-by: Kevin Traynor --- Since the issue could be reproduced only with afxdp ports, this should be backported only down to 2.12. lib/dpif-netdev.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index a798db45d..e456cc9be 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -4941,9 +4941,17 @@ reconfigure_datapath(struct dp_netdev *dp) /* Check for all the ports that need reconfiguration. We cache this in * 'port->need_reconfigure', because netdev_is_reconf_required() can - * change at any time. */ + * change at any time. + * Also mark for reconfiguration all ports which will likely change their + * 'dynamic_txqs' parameter. It's required to stop using them before + * changing this setting and it's simpler to mark ports here and allow + * 'pmd_remove_stale_ports' to remove them from threads. There will be + * no actual reconfiguration in 'port_reconfigure' because it's + * unnecessary. */ HMAP_FOR_EACH (port, node, &dp->ports) { - if (netdev_is_reconf_required(port->netdev)) { + if (netdev_is_reconf_required(port->netdev) + || (port->dynamic_txqs + != (netdev_n_txq(port->netdev) < wanted_txqs))) { port->need_reconfigure = true; } }