[v2] netfilter: xt_connlimit: fix race in connection counting

Message ID 20190103000742.GA108711@dev-dsk-alakeshh-2c-f8a3e6e0.us-west-2.amazon.com
State Awaiting Upstream
Delegated to: David Miller

Commit Message

Alakesh Haloi Jan. 3, 2019, 12:07 a.m. UTC
On a multicore system, an iptables rule like the following accepts more
connections than the limit set in the rule:

iptables  -A INPUT -p tcp -m tcp --syn --dport 7777 -m connlimit \
      --connlimit-above 2000 --connlimit-mask 0 -j DROP

In check_hlist(), connections that are found in the saved connection list
but not in the netfilter conntrack table are deleted, on the assumption
that they no longer exist. On multicore systems, however, there is a small
time window in which a connection has been added to the rb-tree maintained
by xt_connlimit but has not yet made it into the conntrack table. During
that window, concurrent lookups return incorrect counts and the number of
accepted connections goes over the limit set in the iptables rule.

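The fix has been partially backported from the above-mentioned upstream
commit. Record a timestamp and the owning CPU for each saved connection,
and keep an entry that is missing from conntrack if it was added on a
different CPU no more than two jiffies ago.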
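
For illustration only, the keep-or-delete decision can be sketched in
user-space C as below. The struct and function names (saved_conn,
keep_unconfirmed) and the MAX_AGE_TICKS macro are hypothetical stand-ins,
not kernel identifiers; only the two fields and the "age <= 2" threshold
come from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields the patch adds to
 * struct xt_connlimit_conn.
 */
struct saved_conn {
	int cpu;            /* CPU that inserted the entry */
	uint32_t jiffies32; /* low 32 bits of the tick counter at insertion */
};

/* Same threshold as the "age <= 2" test in the patch. */
#define MAX_AGE_TICKS 2u

/* Return true if an entry that is missing from conntrack should still be
 * counted: it was added on another CPU and is at most MAX_AGE_TICKS old.
 */
static bool keep_unconfirmed(const struct saved_conn *conn,
			     int this_cpu, uint32_t now)
{
	/* Unsigned 32-bit subtraction stays correct across counter wrap. */
	uint32_t age = now - conn->jiffies32;

	return conn->cpu != this_cpu && age <= MAX_AGE_TICKS;
}
```

Storing only the low 32 bits of the tick counter is enough here because
the age is computed modulo 2^32, so an entry stamped just before the
counter wraps is still seen as recent just after it.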

Signed-off-by: Alakesh Haloi <alakeshh@amazon.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: stable@vger.kernel.org # v4.15 and before
Cc: netdev@vger.kernel.org
Cc: Dmitry Andrianov <dmitry.andrianov@alertme.com>
Cc: Justin Pettit <jpettit@vmware.com>
Cc: Yi-Hung Wei <yihung.wei@gmail.com>
---
 net/netfilter/xt_connlimit.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

Patch

diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index ffa8eec..e7b092b 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -47,6 +47,8 @@  struct xt_connlimit_conn {
 	struct hlist_node		node;
 	struct nf_conntrack_tuple	tuple;
 	union nf_inet_addr		addr;
+	int				cpu;
+	u32				jiffies32;
 };
 
 struct xt_connlimit_rb {
@@ -126,6 +128,8 @@  static bool add_hlist(struct hlist_head *head,
 		return false;
 	conn->tuple = *tuple;
 	conn->addr = *addr;
+	conn->cpu = raw_smp_processor_id();
+	conn->jiffies32 = (u32)jiffies;
 	hlist_add_head(&conn->node, head);
 	return true;
 }
@@ -148,8 +152,26 @@  static unsigned int check_hlist(struct net *net,
 	hlist_for_each_entry_safe(conn, n, head, node) {
 		found = nf_conntrack_find_get(net, zone, &conn->tuple);
 		if (found == NULL) {
-			hlist_del(&conn->node);
-			kmem_cache_free(connlimit_conn_cachep, conn);
+			/* If the connection is not found, it may be because
+			 * it has not made it into the conntrack table yet.
+			 * We check whether it is a recently created
+			 * connection on a different core and do not delete
+			 * it in that case.
+			 */
+
+			unsigned long a, b;
+			int cpu = raw_smp_processor_id();
+			__u32 age;
+
+			b = conn->jiffies32;
+			a = (u32)jiffies;
+			age = a - b;
+			if (conn->cpu != cpu && age <= 2) {
+				length++;
+			} else {
+				hlist_del(&conn->node);
+				kmem_cache_free(connlimit_conn_cachep, conn);
+			}
 			continue;
 		}
 
@@ -271,6 +293,8 @@  static void tree_nodes_free(struct rb_root *root,
 
 	conn->tuple = *tuple;
 	conn->addr = *addr;
+	conn->cpu = raw_smp_processor_id();
+	conn->jiffies32 = (u32)jiffies;
 	rbconn->addr = *addr;
 
 	INIT_HLIST_HEAD(&rbconn->hhead);