
netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get (v3)

Message ID 1389549033-23523-1-git-send-email-avagin@openvz.org
State Accepted

Commit Message

Andrei Vagin Jan. 12, 2014, 5:50 p.m. UTC
Let's look at destroy_conntrack:

hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);

net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but at this moment a few readers
can still be using it. Then this conntrack is released and another
thread creates a conntrack at the same address with an equal tuple.
After this a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.

But this conntrack is not initialized yet, so it cannot safely be used
by two threads concurrently. In this case the BUG_ON in
nf_nat_setup_info() may be triggered.

Florian Westphal suggested checking the confirm bit too. I think that's
right (a condensed sketch of the resulting lookup follows the diagram
below).

task 1			task 2			task 3
			nf_conntrack_find_get
			 ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
						__nf_conntrack_alloc
						 kmem_cache_alloc
						 memset(&ct->tuplehash[IP_CT_DIR_MAX],
			 if (nf_ct_is_dying(ct))
			 if (!nf_ct_tuple_equal()
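
For reference, here is a condensed sketch of the reader side with the
confirm-bit check folded in. The names come from the patch below; the
precomputed hash argument, stat counters and other details are trimmed,
so treat it as an illustration rather than the exact upstream code:

/* Condensed reader-side lookup under SLAB_DESTROY_BY_RCU: the object
 * found by the RCU walk may have been freed and recycled, so take a
 * reference first and re-validate the key afterwards.  The confirm bit
 * rejects a recycled, not-yet-initialized conntrack that merely happens
 * to carry an equal tuple.
 */
static struct nf_conntrack_tuple_hash *
find_get_sketch(struct net *net, u16 zone,
		const struct nf_conntrack_tuple *tuple)
{
	struct nf_conntrack_tuple_hash *h;
	struct nf_conn *ct;

	rcu_read_lock();
begin:
	h = ____nf_conntrack_find(net, zone, tuple);
	if (h) {
		ct = nf_ct_tuplehash_to_ctrack(h);
		if (unlikely(nf_ct_is_dying(ct) ||
			     !atomic_inc_not_zero(&ct->ct_general.use))) {
			/* freed object: its refcnt already dropped to zero */
			h = NULL;
		} else if (unlikely(!nf_ct_key_equal(h, tuple, zone))) {
			/* recycled object: tuple/zone/confirm no longer match */
			nf_ct_put(ct);
			goto begin;
		}
	}
	rcu_read_unlock();
	return h;
}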

I'm not sure that I have ever seen this race condition in real life.
Currently we are investigating a bug that reproduces on a few nodes.
In our case one conntrack is initialized by a few tasks concurrently;
we don't have any other explanation for this.

<2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
<4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>]  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
<4>[46267.085549] Call Trace:
<4>[46267.085622]  [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
<4>[46267.085697]  [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
<4>[46267.085770]  [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
<4>[46267.085843]  [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
<4>[46267.085919]  [<ffffffff814841b9>] nf_iterate+0x69/0xb0
<4>[46267.085991]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086063]  [<ffffffff81484374>] nf_hook_slow+0x74/0x110
<4>[46267.086133]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086207]  [<ffffffff814b5890>] ? dst_output+0x0/0x20
<4>[46267.086277]  [<ffffffff81495204>] ip_output+0xa4/0xc0
<4>[46267.086346]  [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
<4>[46267.086419]  [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
<4>[46267.086491]  [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
<4>[46267.086562]  [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
<4>[46267.086638]  [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
<4>[46267.086712]  [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
<4>[46267.086785]  [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
<4>[46267.086858]  [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
<4>[46267.086936]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087006]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087081]  [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
<4>[46267.087151]  [<ffffffff81445599>] sys_sendto+0x139/0x190
<4>[46267.087229]  [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
<4>[46267.087303]  [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
<4>[46267.087378]  [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
<4>[46267.087454]  [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
<4>[46267.087531]  [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
<4>[46267.087607]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
<4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
<1>[46267.088023] RIP  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590

v2: move nf_ct_is_confirmed into the unlikely() annotation
v3: Eric suggested fixing the refcnt so that it stays zero until the
    conntrack is added to the hash, but we couldn't find a way to do that.
    Another way is to interpret the confirm bit as part of the search key
    and check it in ____nf_conntrack_find() too.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
---
 net/netfilter/nf_conntrack_core.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

Comments

Eric Dumazet Jan. 12, 2014, 8:21 p.m. UTC | #1
On Sun, 2014-01-12 at 21:50 +0400, Andrey Vagin wrote:
> Let's look at destroy_conntrack:
> 
> hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
> ...
> nf_conntrack_free(ct)
> 	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
> 
> net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.
> 
> The hash is protected by rcu, so readers look up conntracks without
> locks.
> A conntrack is removed from the hash, but at this moment a few readers
> can still be using it. Then this conntrack is released and another
> thread creates a conntrack at the same address with an equal tuple.
> After this a reader starts to validate the conntrack:
> * It's not dying, because a new conntrack was created
> * nf_ct_tuple_equal() returns true.
...


> v2: move nf_ct_is_confirmed into the unlikely() annotation
> v3: Eric suggested fixing the refcnt so that it stays zero until the
>     conntrack is added to the hash, but we couldn't find a way to do that.
>     Another way is to interpret the confirm bit as part of the search key
>     and check it in ____nf_conntrack_find() too.
> 
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Florian Westphal <fw@strlen.de>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> Signed-off-by: Andrey Vagin <avagin@openvz.org>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

Thanks Andrey !


Andrew Vagin Jan. 14, 2014, 10:51 a.m. UTC | #2
On Sun, Jan 12, 2014 at 12:21:14PM -0800, Eric Dumazet wrote:
> On Sun, 2014-01-12 at 21:50 +0400, Andrey Vagin wrote:
> > Let's look at destroy_conntrack:
> > 
> > hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
> > ...
> > nf_conntrack_free(ct)
> > 	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
> > 
> > net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.
> > 
> > The hash is protected by rcu, so readers look up conntracks without
> > locks.
> > A conntrack is removed from the hash, but at this moment a few readers
> > can still be using it. Then this conntrack is released and another
> > thread creates a conntrack at the same address with an equal tuple.
> > After this a reader starts to validate the conntrack:
> > * It's not dying, because a new conntrack was created
> > * nf_ct_tuple_equal() returns true.
> ...
> 
> 
> > v2: move nf_ct_is_confirmed into the unlikely() annotation
> > v3: Eric suggested fixing the refcnt so that it stays zero until the
> >     conntrack is added to the hash, but we couldn't find a way to do that.
> >     Another way is to interpret the confirm bit as part of the search key
> >     and check it in ____nf_conntrack_find() too.
> > 
> > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: Florian Westphal <fw@strlen.de>
> > Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> > Cc: Patrick McHardy <kaber@trash.net>
> > Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
> > Cc: "David S. Miller" <davem@davemloft.net>
> > Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> > Signed-off-by: Andrey Vagin <avagin@openvz.org>
> > ---
> 
> Acked-by: Eric Dumazet <edumazet@google.com>
> 
> Thanks Andrey !
> 

Eh, looks like this patch is incomplete too :(

I think we can't set a reference counter for objects which are allocated
from a SLAB_DESTROY_BY_RCU cache. Look at the following scenario.

cpu1					cpu2
ct = ____nf_conntrack_find()
					destroy_conntrack
atomic_inc_not_zero(ct)
					__nf_conntrack_alloc
					atomic_set(&ct->ct_general.use, 1);
if (!nf_ct_key_equal(h, tuple, zone))
nf_ct_put(ct);
  destroy_conntrack(ct) !!!!
					/* continues to use the conntrack */


Did I miss something again?

I think __nf_conntrack_alloc must use atomic_inc instead of
atomic_set. And we must be sure that the first object from a new page is
zeroed.

I am bringing this up because, after this patch, a bug was triggered from
another place:

<2>[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211!
<4>[67096.759371] invalid opcode: 0000 [#1] SMP 
<4>[67096.759385] last sysfs file: /sys/devices/virtual/block/md0/md/metadata_version
<4>[67096.759414] CPU 2 
<4>[67096.759422] Modules linked in: xt_comment sch_sfq cls_fw sch_htb pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop simfs xt_string xt_hashlimit xt_recent xt_length xt_hl xt_tcpmss xt_TCPMSS xt_multiport xt_limit xt_dscp vzevent coretemp cpufreq_ondemand acpi_cpufreq freq_table mperf 8021q garp stp llc ipt_REJECT iptable_filter iptable_mangle xt_NOTRACK iptable_raw iptable_nat ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state ip6table_filter ip6table_raw xt_MARK ip6table_mangle ip6_tables ext4 jbd2 tun ip_gre ipip vzethdev vznetdev vzrst nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipv6 vzcpt nf_conntrack vzdquota vzmon vziolimit vzdev tunnel4 nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc tpm_tis tpm tpm_bios microcode serio_raw i2c_i801 sg iTCO_wdt iTCO_vendor_support e1000e ext3 jbd mbcache raid1 sd_mod crc_t10dif ata_piix ahci pata_acpi ata_generic i915 drm_kms_helper drm i2c_algo_bit i2c_core video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>[67096.759801] 
<4>[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G         C ---------------    2.6.32-042stab084.18 #1 042stab084_18                  /DQ45CB
<4>[67096.759932] RIP: 0010:[<ffffffffa03d99ac>]  [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack]
<4>[67096.760032] RSP: 0000:ffff88001ae378b8  EFLAGS: 00010246
<4>[67096.760075] RAX: 0000000000000000 RBX: ffff8801a57ac928 RCX: 0000000000065000
<4>[67096.760123] RDX: 000000000000f603 RSI: 0000000000000006 RDI: ffff8801a57ac928
<4>[67096.760174] RBP: ffff88001ae378d8 R08: 0000000000000002 R09: ffff8802373b06e0
<4>[67096.760221] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88023928c080
<4>[67096.760255] R13: ffff880237e8c000 R14: 0000000000000002 R15: 0000000000000002
<4>[67096.760255] FS:  0000000000000000(0000) GS:ffff880028300000(0063) knlGS:00000000b63afbb0
<4>[67096.760255] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
<4>[67096.760255] CR2: 00000000b74f44c0 CR3: 00000000b89c6000 CR4: 00000000000007e0
<4>[67096.760255] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[67096.760255] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[67096.760255] Process atdd (pid: 498649, veid: 666, threadinfo ffff88001ae36000, task ffff88001deaa980)
<4>[67096.760255] Stack:
<4>[67096.760255]  ffff88001ae378e8 ffff88001ae37988 ffff88023928c080 0000000000000003
<4>[67096.760255] <d> ffff88001ae378e8 ffffffff814844a7 ffff88001ae37908 ffffffffa03d9bb5
<4>[67096.760255] <d> ffff88012dcae580 ffff88023928c080 ffff88001ae379e8 ffffffffa03d9fb2
<4>[67096.760255] Call Trace:
<4>[67096.760255]  [<ffffffff814844a7>] nf_conntrack_destroy+0x17/0x30
<4>[67096.760255]  [<ffffffffa03d9bb5>] nf_conntrack_find_get+0x85/0x130 [nf_conntrack]
<4>[67096.760255]  [<ffffffffa03d9fb2>] nf_conntrack_in+0x352/0xb60 [nf_conntrack]
<4>[67096.760255]  [<ffffffffa048c771>] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4]
<4>[67096.760255]  [<ffffffff81484419>] nf_iterate+0x69/0xb0
<4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255]  [<ffffffff814845d4>] nf_hook_slow+0x74/0x110
<4>[67096.760255]  [<ffffffff814b5b00>] ? dst_output+0x0/0x20
<4>[67096.760255]  [<ffffffff814b66d5>] raw_sendmsg+0x775/0x910
<4>[67096.760255]  [<ffffffff8104c5a8>] ? flush_tlb_others_ipi+0x128/0x130
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff814c136a>] inet_sendmsg+0x4a/0xb0
<4>[67096.760255]  [<ffffffff81444e93>] ? sock_sendmsg+0x13/0x140
<4>[67096.760255]  [<ffffffff81444f97>] sock_sendmsg+0x117/0x140
<4>[67096.760255]  [<ffffffff8102e299>] ? native_smp_send_reschedule+0x49/0x60
<4>[67096.760255]  [<ffffffff81519beb>] ? _spin_unlock_bh+0x1b/0x20
<4>[67096.760255]  [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40
<4>[67096.760255]  [<ffffffff814960f0>] ? do_ip_setsockopt+0x90/0xd80
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
<4>[67096.760255]  [<ffffffff814457c9>] sys_sendto+0x139/0x190
<4>[67096.760255]  [<ffffffff810efa77>] ? audit_syscall_entry+0x1d7/0x200
<4>[67096.760255]  [<ffffffff810ef7c5>] ? __audit_syscall_exit+0x265/0x290
<4>[67096.760255]  [<ffffffff81474daf>] compat_sys_socketcall+0x13f/0x210
<4>[67096.760255]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
<4>[67096.760255] Code: 0b ab 0a e1 eb b7 f6 05 34 f8 00 e2 20 74 b7 80 3d f0 b0 00 00 00 74 ae 48 89 de 48 c7 c7 20 16 3e a0 31 c0 e8 05 ca 13 e1 eb 9b <0f> 0b eb fe f6 05 0b f8 00 e2 20 0f 84 db fe ff ff 80 3d eb b0 
<1>[67096.760255] RIP  [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack]
<4>[67096.760255]  RSP <ffff88001ae378b8>
Andrei Vagin Jan. 14, 2014, 11:10 a.m. UTC | #3
>
> Eh, looks like this patch is incomplete too :(
>
> I think we can't set a reference counter for objects which are allocated
> from a SLAB_DESTROY_BY_RCU cache. Look at the following scenario.
>
> cpu1                                    cpu2
> ct = ____nf_conntrack_find()
>                                         destroy_conntrack
> atomic_inc_not_zero(ct)

ct->ct_general.use is zero after destroy_conntrack(). Sorry for the noise.
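
(In other words: destroy_conntrack() is only reached from the put path
once the refcount has already dropped to zero, roughly as in the
simplified sketch below — the real helpers live in
include/net/netfilter/nf_conntrack.h — so in the interleaving above
cpu1's atomic_inc_not_zero() fails and the stale entry is ignored.)

/* Simplified put path: the destructor runs only after the refcount
 * reaches zero, so a racing atomic_inc_not_zero() either wins and keeps
 * the object alive, or sees zero and gives up.
 */
static inline void nf_ct_put_sketch(struct nf_conn *ct)
{
	if (atomic_dec_and_test(&ct->ct_general.use))
		destroy_conntrack(&ct->ct_general); /* ends in nf_conntrack_free() */
}
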
Eric Dumazet Jan. 14, 2014, 2:36 p.m. UTC | #4
On Tue, 2014-01-14 at 14:51 +0400, Andrew Vagin wrote:

> I think __nf_conntrack_alloc must use atomic_inc instead of
> atomic_set. And we must be sure that the first object from a new page is
> zeroed.

No, you cannot do that, and we do not need to.

If a new page is allocated, then you have the guarantee that nobody can
already be using it; its content can be totally random.

Only 'living' objects, the ones that were previously inserted in the
hash table, can be found, and their refcnt must be accurate.

A freed object has refcnt == 0, that's the golden rule.

When the page is freed (after an RCU grace period), nobody cares about
the refcnt anymore.
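
For context, the cache in question is created roughly as below (condensed
from nf_conntrack_init_net(); the real slab name is per-netns and the
surrounding error handling is omitted). SLAB_DESTROY_BY_RCU only defers
freeing of the backing slab pages until after an RCU grace period;
individual objects can be recycled immediately, which is exactly why the
refcount plus the key re-check have to do the rest of the work:

/* Condensed sketch: SLAB_DESTROY_BY_RCU keeps the pages mapped across an
 * RCU grace period, so a stale pointer always points at valid memory,
 * but the object behind it may already be a different, recycled conntrack.
 */
net->ct.nf_conntrack_cachep =
	kmem_cache_create("nf_conntrack",	/* per-netns name in the real code */
			  sizeof(struct nf_conn), 0,
			  SLAB_DESTROY_BY_RCU, NULL);

/* The free side is reached only once the refcount hit zero; the object
 * becomes reusable right away, only the page is RCU-deferred.
 */
kmem_cache_free(net->ct.nf_conntrack_cachep, ct);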



Pablo Neira Ayuso Jan. 29, 2014, 7:21 p.m. UTC | #5
On Sun, Jan 12, 2014 at 12:21:14PM -0800, Eric Dumazet wrote:
> On Sun, 2014-01-12 at 21:50 +0400, Andrey Vagin wrote:
> > Let's look at destroy_conntrack:
> > 
> > hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
> > ...
> > nf_conntrack_free(ct)
> > 	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
> > 
> > net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.
> > 
> > The hash is protected by rcu, so readers look up conntracks without
> > locks.
> > A conntrack is removed from the hash, but at this moment a few readers
> > can still be using it. Then this conntrack is released and another
> > thread creates a conntrack at the same address with an equal tuple.
> > After this a reader starts to validate the conntrack:
> > * It's not dying, because a new conntrack was created
> > * nf_ct_tuple_equal() returns true.
> ...
> 
> 
> > v2: move nf_ct_is_confirmed into the unlikely() annotation
> > v3: Eric suggested fixing the refcnt so that it stays zero until the
> >     conntrack is added to the hash, but we couldn't find a way to do that.
> >     Another way is to interpret the confirm bit as part of the search key
> >     and check it in ____nf_conntrack_find() too.
> > 
> > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: Florian Westphal <fw@strlen.de>
> > Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> > Cc: Patrick McHardy <kaber@trash.net>
> > Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
> > Cc: "David S. Miller" <davem@davemloft.net>
> > Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> > Signed-off-by: Andrey Vagin <avagin@openvz.org>
> > ---
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

Applied, thanks everyone!

Patch

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 43549eb..af6ad2e 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -318,6 +318,21 @@  static void death_by_timeout(unsigned long ul_conntrack)
 	nf_ct_delete((struct nf_conn *)ul_conntrack, 0, 0);
 }
 
+static inline bool
+nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
+			const struct nf_conntrack_tuple *tuple,
+			u16 zone)
+{
+	struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+
+	/* A conntrack can be recreated with the equal tuple,
+	 * so we need to check that the conntrack is confirmed
+	 */
+	return nf_ct_tuple_equal(tuple, &h->tuple) &&
+		nf_ct_zone(ct) == zone &&
+		nf_ct_is_confirmed(ct);
+}
+
 /*
  * Warning :
  * - Caller must take a reference on returned object
@@ -339,8 +354,7 @@  ____nf_conntrack_find(struct net *net, u16 zone,
 	local_bh_disable();
 begin:
 	hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[bucket], hnnode) {
-		if (nf_ct_tuple_equal(tuple, &h->tuple) &&
-		    nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)) == zone) {
+		if (nf_ct_key_equal(h, tuple, zone)) {
 			NF_CT_STAT_INC(net, found);
 			local_bh_enable();
 			return h;
@@ -387,8 +401,7 @@  begin:
 			     !atomic_inc_not_zero(&ct->ct_general.use)))
 			h = NULL;
 		else {
-			if (unlikely(!nf_ct_tuple_equal(tuple, &h->tuple) ||
-				     nf_ct_zone(ct) != zone)) {
+			if (unlikely(!nf_ct_key_equal(h, tuple, zone))) {
 				nf_ct_put(ct);
 				goto begin;
 			}