From patchwork Tue Aug 6 20:23:26 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yifeng Sun
X-Patchwork-Id: 1143048
From: Yifeng Sun
To: dev@openvswitch.org
Date: Tue, 6 Aug 2019 13:23:26 -0700
Message-Id: <1565123006-20430-1-git-send-email-pkusunyifeng@gmail.com>
X-Mailer: git-send-email 2.7.4
Subject: [ovs-dev] [PATCH] datapath: compat: Backports bugfixes for
 nf_conncount

This patch backports several critical bug fixes related to locking and
data consistency in the nf_conncount code. The backport is based on the
following upstream net-next commits:

d4e7df1 ("netfilter: nf_conncount: use rb_link_node_rcu() instead of rb_link_node()")
53ca0f2 ("netfilter: nf_conncount: remove wrong condition check routine")
3c5cdb1 ("netfilter: nf_conncount: fix unexpected permanent node of list.")
31568ec ("netfilter: nf_conncount: fix list_del corruption in conn_free")
fd3e71a ("netfilter: nf_conncount: use spin_lock_bh instead of spin_lock")

VMware-BZ: #2396471
CC: Taehee Yoo
Signed-off-by: Yifeng Sun
---
 datapath/linux/compat/nf_conncount.c | 54 ++++++++++++++++++++++--------------
 1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/datapath/linux/compat/nf_conncount.c b/datapath/linux/compat/nf_conncount.c
index eeae440f872d..6e4f368b9389 100644
--- a/datapath/linux/compat/nf_conncount.c
+++ b/datapath/linux/compat/nf_conncount.c
@@ -49,12 +49,13 @@
 
 /* we will save the tuples of all connections we care about */
 struct nf_conncount_tuple {
-	struct list_head node;
+	struct list_head		node;
 	struct nf_conntrack_tuple	tuple;
 	struct nf_conntrack_zone	zone;
-	int cpu;
-	u32 jiffies32;
-	struct rcu_head rcu_head;
+	int				cpu;
+	u32				jiffies32;
+	bool				dead;
+	struct rcu_head			rcu_head;
 };
 
 struct nf_conncount_rb {
@@ -111,15 +112,16 @@ nf_conncount_add(struct nf_conncount_list *list,
 	conn->zone = *zone;
 	conn->cpu = raw_smp_processor_id();
 	conn->jiffies32 = (u32)jiffies;
-	spin_lock(&list->list_lock);
+	conn->dead = false;
+	spin_lock_bh(&list->list_lock);
 	if (list->dead == true) {
 		kmem_cache_free(conncount_conn_cachep, conn);
-		spin_unlock(&list->list_lock);
+		spin_unlock_bh(&list->list_lock);
 		return NF_CONNCOUNT_SKIP;
 	}
 	list_add_tail(&conn->node, &list->head);
 	list->count++;
-	spin_unlock(&list->list_lock);
+	spin_unlock_bh(&list->list_lock);
 	return NF_CONNCOUNT_ADDED;
 }
 
@@ -136,19 +138,22 @@ static bool conn_free(struct nf_conncount_list *list,
 {
 	bool free_entry = false;
 
-	spin_lock(&list->list_lock);
+	spin_lock_bh(&list->list_lock);
 
-	if (list->count == 0) {
-		spin_unlock(&list->list_lock);
+	if (conn->dead) {
+		spin_unlock_bh(&list->list_lock);
 		return free_entry;
 	}
 
 	list->count--;
+	conn->dead = true;
 	list_del_rcu(&conn->node);
-	if (list->count == 0)
+	if (list->count == 0) {
+		list->dead = true;
 		free_entry = true;
+	}
 
-	spin_unlock(&list->list_lock);
+	spin_unlock_bh(&list->list_lock);
 	call_rcu(&conn->rcu_head, __conn_free);
 	return free_entry;
 }
@@ -160,7 +165,7 @@ find_or_evict(struct net *net, struct nf_conncount_list *list,
 	const struct nf_conntrack_tuple_hash *found;
 	unsigned long a, b;
 	int cpu = raw_smp_processor_id();
-	__s32 age;
+	u32 age;
 
 	found = nf_conntrack_find_get(net, &conn->zone, &conn->tuple);
 	if (found)
@@ -248,7 +253,7 @@ static void nf_conncount_list_init(struct nf_conncount_list *list)
 {
 	spin_lock_init(&list->list_lock);
 	INIT_LIST_HEAD(&list->head);
-	list->count = 1;
+	list->count = 0;
 	list->dead = false;
 }
 
@@ -261,6 +266,7 @@ static bool nf_conncount_gc_list(struct net *net,
 	struct nf_conn *found_ct;
 	unsigned int collected = 0;
 	bool free_entry = false;
+	bool ret = false;
 
 	list_for_each_entry_safe(conn, conn_n, &list->head, node) {
 		found = find_or_evict(net, list, conn, &free_entry);
@@ -290,7 +296,15 @@ static bool nf_conncount_gc_list(struct net *net,
 		if (collected > CONNCOUNT_GC_MAX_NODES)
 			return false;
 	}
-	return false;
+
+	spin_lock_bh(&list->list_lock);
+	if (!list->count) {
+		list->dead = true;
+		ret = true;
+	}
+	spin_unlock_bh(&list->list_lock);
+
+	return ret;
 }
 
 static void __tree_nodes_free(struct rcu_head *h)
@@ -310,11 +324,8 @@ static void tree_nodes_free(struct rb_root *root,
 	while (gc_count) {
 		rbconn = gc_nodes[--gc_count];
 		spin_lock(&rbconn->list.list_lock);
-		if (rbconn->list.count == 0 && rbconn->list.dead == false) {
-			rbconn->list.dead = true;
-			rb_erase(&rbconn->node, root);
-			call_rcu(&rbconn->rcu_head, __tree_nodes_free);
-		}
+		rb_erase(&rbconn->node, root);
+		call_rcu(&rbconn->rcu_head, __tree_nodes_free);
 		spin_unlock(&rbconn->list.list_lock);
 	}
 }
@@ -415,8 +426,9 @@ insert_tree(struct net *net,
 	nf_conncount_list_init(&rbconn->list);
 	list_add(&conn->node, &rbconn->list.head);
 	count = 1;
+	rbconn->list.count = count;
 
-	rb_link_node(&rbconn->node, parent, rbnode);
+	rb_link_node_rcu(&rbconn->node, parent, rbnode);
 	rb_insert_color(&rbconn->node, root);
 out_unlock:
 	spin_unlock_bh(&nf_conncount_locks[hash % CONNCOUNT_LOCK_SLOTS]);
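
Reviewer note (not part of the patch): the conn_free() hunk replaces the
"list->count == 0" early return with a per-entry "dead" flag, so a second
free of the same entry becomes a no-op instead of decrementing the count
again. The user-space sketch below only illustrates that guard. The names
conn_model, list_model and conn_free_model are hypothetical stand-ins, and
the locking, list unlinking and RCU freeing done by the real code are
reduced to comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct nf_conncount_tuple (only the flag). */
struct conn_model {
	bool dead;
};

/* Hypothetical stand-in for struct nf_conncount_list (no lock, no list). */
struct list_model {
	unsigned int count;
	bool dead;
};

/*
 * Models the patched conn_free(): returns true when removing this entry
 * emptied the list. The kernel code holds list->list_lock via
 * spin_lock_bh() around this whole body.
 */
static bool conn_free_model(struct list_model *list, struct conn_model *conn)
{
	bool free_entry = false;

	if (conn->dead)
		return free_entry;	/* already removed: do nothing */

	assert(list->count > 0);	/* a live entry implies a non-empty list */
	list->count--;
	conn->dead = true;
	/* the kernel code calls list_del_rcu(&conn->node) here */
	if (list->count == 0) {
		list->dead = true;	/* no new entries may be added */
		free_entry = true;
	}
	/* the kernel code calls call_rcu(&conn->rcu_head, __conn_free) here */
	return free_entry;
}
```

Calling conn_free_model() twice on the same entry leaves the count
unchanged after the first call, which is the property the upstream
"fix list_del corruption in conn_free" commit is after.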