diff mbox series

[nf] netfilter: nf_nat: skip nat clash resolution for same-origin entries

Message ID 20190129145142.11378-1-fw@strlen.de
State Accepted
Delegated to: Pablo Neira

Commit Message

Florian Westphal Jan. 29, 2019, 2:51 p.m. UTC
From: Martynas Pumputis <martynas@weave.works>

It is possible that two concurrent packets originating from the same
socket of a connection-less protocol (e.g. UDP) can end up having
different IP_CT_DIR_REPLY tuples which results in one of the packets
being dropped.

To illustrate this, consider the following simplified scenario:

1. Packets A and B are sent at the same time from two different threads
   by the same UDP socket.  No matching conntrack entry exists yet.
   Both packets cause allocation of a new conntrack entry.
2. get_unique_tuple gets called for A.  No clashing entry found.
   conntrack entry for A is added to main conntrack table.
3. get_unique_tuple is called for B and will find that the reply
   tuple of B is already taken by A.
   It will allocate a new UDP source port for B to resolve the clash.
4. conntrack entry for B cannot be added to main conntrack table
   because its ORIGINAL direction is clashing with A and the REPLY
   directions of A and B are not the same anymore due to UDP source
   port reallocation done in step 3.

This patch modifies nf_conntrack_tuple_taken so it doesn't consider
colliding reply tuples if the IP_CT_DIR_ORIGINAL tuples are equal.
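
The check described above can be sketched in plain userspace C. This is only
a toy model, not the kernel code: struct toy_tuple, struct toy_conn,
toy_tuple_taken() and toy_same_origin_clash() are hypothetical names invented
for illustration, the table is a flat array that is compared on REPLY tuples
only, and zones, namespaces, hashing and RCU are left out entirely. The
skip_same_origin flag stands in for "with this patch applied".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for struct nf_conntrack_tuple. */
struct toy_tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

enum { TOY_DIR_ORIGINAL = 0, TOY_DIR_REPLY = 1, TOY_DIR_MAX = 2 };

/* Hypothetical stand-in for struct nf_conn: one tuple per direction. */
struct toy_conn {
	struct toy_tuple tuple[TOY_DIR_MAX];
};

static bool toy_tuple_equal(const struct toy_tuple *a,
			    const struct toy_tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Return true if 'reply' is already used as a REPLY tuple by an entry in
 * 'table', meaning the caller would have to pick a new source port.
 * With skip_same_origin set, entries whose ORIGINAL tuple equals that of
 * 'ignored' are not counted as a clash (assumed model of the patch).
 */
static bool toy_tuple_taken(const struct toy_conn *table, size_t n,
			    const struct toy_tuple *reply,
			    const struct toy_conn *ignored,
			    bool skip_same_origin)
{
	assert(table != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!toy_tuple_equal(&table[i].tuple[TOY_DIR_REPLY], reply))
			continue;

		/* Same ORIGINAL tuple: both entries describe the same flow. */
		if (skip_same_origin &&
		    toy_tuple_equal(&ignored->tuple[TOY_DIR_ORIGINAL],
				    &table[i].tuple[TOY_DIR_ORIGINAL]))
			continue;

		return true;
	}
	return false;
}

/* Scenario from the steps above: A is already in the table and B comes
 * from the same socket, so both of B's tuples equal those of A.
 */
static bool toy_same_origin_clash(bool skip_same_origin)
{
	const struct toy_conn a = {
		.tuple = {
			[TOY_DIR_ORIGINAL] = { 0x0a000001, 0x0a000002, 40000, 53 },
			[TOY_DIR_REPLY]    = { 0x0a000002, 0xc0a80001, 53, 40000 },
		},
	};
	const struct toy_conn b = a;

	return toy_tuple_taken(&a, 1, &b.tuple[TOY_DIR_REPLY], &b,
			       skip_same_origin);
}
```

In this model, toy_same_origin_clash(false) reports a clash, which is what
triggers the source port reallocation in step 3, while
toy_same_origin_clash(true) does not, so B keeps the same REPLY tuple as A.
A flow with a different ORIGINAL tuple is still reported as taken.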

[ Florian: simplify patch to not use .allow_clash setting
  and always ignore identical flows ]

Signed-off-by: Martynas Pumputis <martynas@weave.works>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_conntrack_core.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Pablo Neira Ayuso Feb. 4, 2019, 1:24 p.m. UTC | #1
On Tue, Jan 29, 2019 at 03:51:42PM +0100, Florian Westphal wrote:
> From: Martynas Pumputis <martynas@weave.works>
> 
> It is possible that two concurrent packets originating from the same
> socket of a connection-less protocol (e.g. UDP) can end up having
> different IP_CT_DIR_REPLY tuples which results in one of the packets
> being dropped.
> 
> To illustrate this, consider the following simplified scenario:
> 
> 1. Packets A and B are sent at the same time from two different threads
>    by the same UDP socket.  No matching conntrack entry exists yet.
>    Both packets cause allocation of a new conntrack entry.
> 2. get_unique_tuple gets called for A.  No clashing entry found.
>    conntrack entry for A is added to main conntrack table.
> 3. get_unique_tuple is called for B and will find that the reply
>    tuple of B is already taken by A.
>    It will allocate a new UDP source port for B to resolve the clash.
> 4. conntrack entry for B cannot be added to main conntrack table
>    because its ORIGINAL direction is clashing with A and the REPLY
>    directions of A and B are not the same anymore due to UDP source
>    port reallocation done in step 3.
> 
> This patch modifies nf_conntrack_tuple_taken so it doesn't consider
> colliding reply tuples if the IP_CT_DIR_ORIGINAL tuples are equal.
> 
> [ Florian: simplify patch to not use .allow_clash setting
>   and always ignore identical flows ]

I prefer this band aid remains small indeed.

Applied, thanks Florian.

Patch

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 741b533148ba..db4d46332e86 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1007,6 +1007,22 @@  nf_conntrack_tuple_taken(const struct nf_conntrack_tuple *tuple,
 		}
 
 		if (nf_ct_key_equal(h, tuple, zone, net)) {
+			/* Tuple is taken already, so caller will need to find
+			 * a new source port to use.
+			 *
+			 * Only exception:
+			 * If the *original tuples* are identical, then both
+			 * conntracks refer to the same flow.
+			 * This is a rare situation, it can occur e.g. when
+			 * more than one UDP packet is sent from same socket
+			 * in different threads.
+			 *
+			 * Let nf_ct_resolve_clash() deal with this later.
+			 */
+			if (nf_ct_tuple_equal(&ignored_conntrack->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
+					      &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple))
+				continue;
+
 			NF_CT_STAT_INC_ATOMIC(net, found);
 			rcu_read_unlock();
 			return 1;