From patchwork Mon Jan 14 18:55:21 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 1024743
From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Subject: [SRU T][PATCH 2/3] netfilter: nf_conncount: fix garbage collection confirm race
Date: Mon, 14 Jan 2019 16:55:21 -0200
Message-Id: <20190114185522.10533-3-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190114185522.10533-1-mfo@canonical.com>
References: <20190114185522.10533-1-mfo@canonical.com>
List-Id: Kernel team discussions

From: Florian Westphal

BugLink: https://bugs.launchpad.net/bugs/1811094

Yi-Hung Wei and Justin Pettit found a race in the garbage collection scheme
used by nf_conncount.

When doing list walk, we look up the tuple in the conntrack table.
If the lookup fails we remove this tuple from our list because the conntrack
entry is gone.

This is the common cause, but it turns out it's not the only one. The list
entry could have been created just before by another cpu, i.e. the conntrack
entry might not yet have been inserted into the global hash.

To avoid this, we introduce a timestamp and the owning cpu.
If the entry appears to be stale, evict only if:
 1. The current cpu is the one that added the entry, or,
 2. The timestamp is older than two jiffies

The second constraint allows GC to be taken over by other cpu too
(e.g. because a cpu was offlined or napi got moved to another cpu).

We can't pretend the 'doubtful' entry wasn't in our list.
Instead, when we don't find an entry indicate via IS_ERR that entry
was removed ('did not exist') or withheld ('might-be-unconfirmed').

This most likely also fixes a xt_connlimit imbalance earlier reported by
Dmitry Andrianov.

Cc: Dmitry Andrianov
Reported-by: Justin Pettit
Reported-by: Yi-Hung Wei
Signed-off-by: Florian Westphal
Acked-by: Yi-Hung Wei
Signed-off-by: Pablo Neira Ayuso
(backported from commit b36e4523d4d56e2595e28f16f6ccf1cd6a9fc452)
[mfo: backport: refresh context lines and use older symbol/file names:
 - nf_conncount.c -> xt_connlimit.c.
 - nf_conncount_rb -> xt_connlimit_rb
 - nf_conncount_tuple -> xt_connlimit_conn
 - conncount_conn_cachep -> connlimit_conn_cachep
 - hunk 1:
   - refresh context lines; struct xt_connlimit_conn has the 'addr' field
     (and it's used), and doesn't have 'zone'.
 - hunk 2, part 1 (assignments) -> hunk 5:
   - not in its own function (add_hlist()/nf_conncount_add()), but in
     count_them(); refresh context lines/indentation.
 - hunk 2, part 2 (find_or_evict()) -> hunk 2:
   - s/&conn->zone/NF_CT_DEFAULT_ZONE/ as in the removed chunk, due to
     lack of commit e59ea3df3fc2 ("netfilter: xt_connlimit: honor
     conntrack zone if available");
   - s/kmem_cache_free()/kfree()/ as in the removed chunk.
 - hunk 3, part 1 (move line) -> hunk 3:
   - refresh context lines.
 - hunk 3, part 2 (remove/add) -> hunk 4:
   - s/head/hash/ in the hlist_for_each_entry_safe() call due to lack
     of commit 15cfd5289575 ("netfilter: connlimit: factor hlist search
     into new function")
   - s/&conn->zone/NF_CT_DEFAULT_ZONE/ due to lack of commit
     e59ea3df3fc2 ("netfilter: xt_connlimit: honor conntrack zone if
     available");
   - s/kmem_cache_free()/kfree()/
   - s/length/matches/ due to lack of commit 7d08487777c8 ("netfilter:
     connlimit: use rbtree for per-host conntrack obj storage")
   - remove the zone-related addition lines (thus remove the 'tuple'
     check as well), as zones are not used yet (as mentioned above).]
Signed-off-by: Mauricio Faria de Oliveira
---
 net/netfilter/xt_connlimit.c | 45 +++++++++++++++++++++++++++++++-----
 1 file changed, 39 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index 5d18f39ad69b..df33ca3bc42f 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -36,6 +36,8 @@ struct xt_connlimit_conn {
 	struct hlist_node node;
 	struct nf_conntrack_tuple tuple;
 	union nf_inet_addr addr;
+	int cpu;
+	u32 jiffies32;
 };
 
 struct xt_connlimit_data {
@@ -92,6 +94,35 @@ same_source_net(const union nf_inet_addr *addr,
 	}
 }
 
+static const struct nf_conntrack_tuple_hash *
+find_or_evict(struct net *net, struct xt_connlimit_conn *conn)
+{
+	const struct nf_conntrack_tuple_hash *found;
+	unsigned long a, b;
+	int cpu = raw_smp_processor_id();
+	__s32 age;
+
+	found = nf_conntrack_find_get(net, NF_CT_DEFAULT_ZONE, &conn->tuple);
+	if (found)
+		return found;
+	b = conn->jiffies32;
+	a = (u32)jiffies;
+
+	/* conn might have been added just before by another cpu and
+	 * might still be unconfirmed. In this case, nf_conntrack_find()
+	 * returns no result. Thus only evict if this cpu added the
+	 * stale entry or if the entry is older than two jiffies.
+	 */
+	age = a - b;
+	if (conn->cpu == cpu || age >= 2) {
+		hlist_del(&conn->node);
+		kfree(conn);
+		return ERR_PTR(-ENOENT);
+	}
+
+	return ERR_PTR(-EAGAIN);
+}
+
 static int count_them(struct net *net,
 		      struct xt_connlimit_data *data,
 		      const struct nf_conntrack_tuple *tuple,
@@ -101,8 +132,8 @@ static int count_them(struct net *net,
 {
 	const struct nf_conntrack_tuple_hash *found;
 	struct xt_connlimit_conn *conn;
-	struct hlist_node *n;
 	struct nf_conn *found_ct;
+	struct hlist_node *n;
 	struct hlist_head *hash;
 	bool addit = true;
 	int matches = 0;
@@ -116,11 +147,11 @@ static int count_them(struct net *net,
 
 	/* check the saved connections */
 	hlist_for_each_entry_safe(conn, n, hash, node) {
-		found = nf_conntrack_find_get(net, NF_CT_DEFAULT_ZONE,
-					      &conn->tuple);
-		if (found == NULL) {
-			hlist_del(&conn->node);
-			kfree(conn);
+		found = find_or_evict(net, conn);
+		if (IS_ERR(found)) {
+			/* Not found, but might be about to be confirmed */
+			if (PTR_ERR(found) == -EAGAIN)
+				matches++;
 			continue;
 		}
 
@@ -159,6 +190,8 @@ static int count_them(struct net *net,
 			return -ENOMEM;
 		conn->tuple = *tuple;
 		conn->addr = *addr;
+		conn->cpu = raw_smp_processor_id();
+		conn->jiffies32 = (u32)jiffies;
 		hlist_add_head(&conn->node, hash);
 		++matches;
 	}
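
Note (illustration only, not part of the patch): the eviction rule from the
commit message, "evict only if this cpu added the entry or the timestamp is
two or more jiffies old", can be modelled in userspace with the minimal sketch
below. The names conn_model, gc_verdict and gc_decide are hypothetical and
exist only for this example. The kernel lookup is left out, so the function
covers only the case where the conntrack lookup has already failed. The
conversion to a signed 32-bit age assumes the usual two's-complement
behaviour, as the patch itself does with __s32.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the two fields the patch adds to each list entry. */
struct conn_model {
	int cpu;            /* CPU that inserted the entry */
	uint32_t jiffies32; /* low 32 bits of the tick counter at insertion */
};

enum gc_verdict {
	GC_EVICT,    /* plays the role of ERR_PTR(-ENOENT): entry is dropped */
	GC_WITHHOLD, /* plays the role of ERR_PTR(-EAGAIN): entry is kept and counted */
};

/*
 * Decide what to do with an entry whose conntrack lookup found nothing.
 * 'this_cpu' and 'now' are passed in explicitly so the rule can be tested
 * without any kernel state.
 */
static enum gc_verdict gc_decide(const struct conn_model *conn,
				 int this_cpu, uint32_t now)
{
	int32_t age;

	assert(conn);

	/*
	 * Unsigned subtraction is modulo 2^32, so the difference stays
	 * correct when the tick counter wraps between insertion and now.
	 */
	age = (int32_t)(now - conn->jiffies32);

	if (conn->cpu == this_cpu || age >= 2)
		return GC_EVICT;

	return GC_WITHHOLD;
}
```

In this model an entry inserted by another CPU zero or one tick ago is
withheld, which matches the matches++ branch in count_them(). The same entry
is evicted once it is two ticks old, or immediately when the CPU that inserted
it performs the walk.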