From patchwork Wed Oct 18 14:23:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Marchand X-Patchwork-Id: 1850931 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=MmUvGwuQ; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4S9Y4n2F5hz20Zj for ; Thu, 19 Oct 2023 01:24:09 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 57AC24202A; Wed, 18 Oct 2023 14:24:06 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 57AC24202A Authentication-Results: smtp4.osuosl.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=MmUvGwuQ X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id g-URPu_3IWH2; Wed, 18 Oct 2023 14:24:05 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp4.osuosl.org (Postfix) with ESMTPS id 3ACFA41FAA; Wed, 18 Oct 2023 14:24:04 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 3ACFA41FAA Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 17F12C0071; Wed, 18 Oct 2023 14:24:04 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp4.osuosl.org (smtp4.osuosl.org [IPv6:2605:bc80:3010::137]) by lists.linuxfoundation.org (Postfix) with ESMTP id BC859C0032 for ; Wed, 18 Oct 2023 14:24:02 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 8A9E941FC8 for ; Wed, 18 Oct 2023 14:24:02 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 8A9E941FC8 X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id aOBheqS2yFxV for ; Wed, 18 Oct 2023 14:24:01 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by smtp4.osuosl.org (Postfix) with ESMTPS id 8CA6A41FAA for ; Wed, 18 Oct 2023 14:24:01 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 8CA6A41FAA DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1697639040; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=o9YeNi3R2s3HSLCmgjb1sev3QWS2oXOz9gDI0+7WgV8=; b=MmUvGwuQUnbZU5pvDzvUh6dTe761jFu24Zp/1Q1aShl1jeIV/iSBiJvnFaOILDgTpM7DYO pWjGK4xQihLSDMFeog1geRRw4QOkxMsx4vqKJN0lVZbEL5zAdfHKazoiHEyxS/iWywG1MS RMPzNksB+FzGsNrLqir/b3MLqN6invU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-33-36SDc2bYOnWRbQrcu1Vsog-1; Wed, 18 Oct 2023 10:23:57 -0400 X-MC-Unique: 36SDc2bYOnWRbQrcu1Vsog-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1FDD710201FB; Wed, 18 Oct 2023 14:23:57 +0000 (UTC) Received: from dmarchan.redhat.com (unknown [10.45.225.182]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1D7F6492BFA; Wed, 18 Oct 2023 14:23:55 +0000 (UTC) From: David Marchand To: dev@openvswitch.org Date: Wed, 18 Oct 2023 16:23:53 +0200 Message-ID: <20231018142353.2439714-1-david.marchand@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.10 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Cc: Ilya Maximets Subject: [ovs-dev] [PATCH v2] ofproto-dpif-upcall: Pause revalidators when purging. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" This issue has been observed when running traffic tests with a dpdk enabled userspace datapath (though those tests are added in a separate series). However, the described issue also affects the kernel datapath which is why this patch is sent separately. A main thread executing the 'revalidator/purge' command could race with revalidator threads that can be dumping/sweeping the purged flows at the same time. This race can be reproduced (with dpif debug logs) by running the conntrack - ICMP related unit tests with the userspace datapath: 2023-10-09T14:11:55.242Z|00177|unixctl|DBG|received request revalidator/purge[], id=0 2023-10-09T14:11:55.242Z|00044|dpif(revalidator17)|DBG|netdev@ovs-netdev: flow_dump ufid:68ff6817-fb3b-4b30-8412-9cf175318294 , packets:0, bytes:0, used:never 2023-10-09T14:11:55.242Z|00178|dpif|DBG|netdev@ovs-netdev: flow_del ufid:07046e91-30a6-4862-9048-1a76b5a88a5b recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0), ct_state(0),ct_zone(0),ct_mark(0),ct_label(0), packet_type(ns=0,id=0), eth(src=a6:0a:bf:e2:f3:f2,dst=62:23:0f:f6:2c:75), eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0, ttl=64,frag=no),udp(src=37380,dst=10000), packets:0, bytes:0, used:never ... 2023-10-09T14:11:55.242Z|00049|dpif(revalidator17)|WARN|netdev@ovs-netdev: failed to flow_get (No such file or directory) ufid:07046e91-30a6-4862-9048-1a76b5a88a5b , packets:0, bytes:0, used:never 2023-10-09T14:11:55.242Z|00050|ofproto_dpif_upcall(revalidator17)|WARN| Failed to acquire udpif_key corresponding to unexpected flow (No such file or directory): ufid:07046e91-30a6-4862-9048-1a76b5a88a5b ... 2023-10-09T14:11:55.242Z|00183|unixctl|DBG|replying with success, id=0: "" To avoid this race, a first part of the fix is to pause (if not already paused) the revalidators while the main thread is purging the datapath flows. Then a second issue is observed by running the same unit test with the kernel datapath. Its dpif implementation dumps flows via a netlink request (see dpif_flow_dump_create(), dpif_netlink_flow_dump_create(), nl_dump_start(), nl_sock_send__()) in the leader revalidator thread, before pausing revalidators: 2023-10-09T14:44:28.742Z|00122|unixctl|DBG|received request revalidator/purge[], id=0 ... 2023-10-09T14:44:28.742Z|00125|dpif|DBG|system@ovs-system: flow_del ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 recirc_id(0),dp_hash(0), skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0), ct_mark(0),ct_label(0),eth(src=a6:0a:bf:e2:f3:f2, dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.1.1.1, tip=10.1.1.2,op=1,sha=a6:0a:bf:e2:f3:f2,tha=00:00:00:00:00:00), packets:0, bytes:0, used:never ... 2023-10-09T14:44:28.742Z|00129|unixctl|DBG|replying with success, id=0: "" ... 2023-10-09T14:44:28.742Z|00006|dpif(revalidator21)|DBG|system@ovs-system: flow_dump ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 , packets:0, bytes:0, used:never ... 2023-10-09T14:44:28.742Z|00012|dpif(revalidator21)|WARN|system@ovs-system: failed to flow_get (No such file or directory) ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 , packets:0, bytes:0, used:never 2023-10-09T14:44:28.742Z|00013|ofproto_dpif_upcall(revalidator21)|WARN| Failed to acquire udpif_key corresponding to unexpected flow (No such file or directory): ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 To avoid evaluating already deleted flows, the second part of the fix is to ensure that dumping from the leader revalidator thread is done out of any pause request. As a result of this patch, the unit test "offloads - delete ufid mapping if device not exist - offloads enabled" does not need to waive the random warning logs when purging dp flows. Fixes: 98bb4286970d ("tests: Add command to purge revalidators of flows.") Acked-by: Eelco Chaudron Signed-off-by: David Marchand Acked-by: Eelco Chaudron Acked-by: Simon Horman --- Changes since v1: - updated commitlog, - added check-offloads unit test update, --- ofproto/ofproto-dpif-upcall.c | 17 +++++++++++++++-- tests/system-offloads-traffic.at | 2 -- 2 files changed, 15 insertions(+), 4 deletions(-) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index cde03abc6d..e49e9f7e50 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -990,7 +990,7 @@ udpif_revalidator(void *arg) udpif->reval_exit = latch_is_set(&udpif->exit_latch); start_time = time_msec(); - if (!udpif->reval_exit) { + if (!udpif->reval_exit && !udpif->pause) { bool terse_dump; terse_dump = udpif_use_ufid(udpif); @@ -1000,10 +1000,15 @@ udpif_revalidator(void *arg) } } - /* Wait for the leader to start the flow dump. */ + /* Wait for the leader to reach this point. */ ovs_barrier_block(&udpif->reval_barrier); if (udpif->pause) { revalidator_pause(revalidator); + if (!udpif->reval_exit) { + /* The main thread resumed all validators, but the leader + * dumped nothing, go to next iteration. */ + continue; + } } if (udpif->reval_exit) { @@ -3217,11 +3222,19 @@ upcall_unixctl_purge(struct unixctl_conn *conn, int argc OVS_UNUSED, struct udpif *udpif; LIST_FOR_EACH (udpif, list_node, &all_udpifs) { + bool wake_up = false; int n; + if (!latch_is_set(&udpif->pause_latch)) { + udpif_pause_revalidators(udpif); + wake_up = true; + } for (n = 0; n < udpif->n_revalidators; n++) { revalidator_purge(&udpif->revalidators[n]); } + if (wake_up) { + udpif_resume_revalidators(udpif); + } } unixctl_command_reply(conn, ""); } diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at index 81f3dc8c1e..e9a4587653 100644 --- a/tests/system-offloads-traffic.at +++ b/tests/system-offloads-traffic.at @@ -799,8 +799,6 @@ AT_CHECK([test $(ovs-appctl dpctl/dump-flows | grep -c "eth_type(0x0800)") -eq 0 OVS_TRAFFIC_VSWITCHD_STOP(["/could not open network device ovs-p0/d /on nonexistent port/d -/failed to flow_get/d -/Failed to acquire udpif_key/d /No such device/d /failed to offload flow/d "])