
[1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.

Message ID alpine.OSX.2.20.1510261144390.3919@athabasca.local
State Awaiting Upstream, archived
Delegated to: David Miller
Headers show

Commit Message

Ani Sinha Oct. 26, 2015, 6:55 p.m. UTC
netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

Let's look at destroy_conntrack:

hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);

net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by RCU, so readers look up conntracks without
taking locks. A conntrack can be removed from the hash while a few
readers are still using it. The conntrack is then released, and another
thread creates a new conntrack at the same address with an equal tuple.
After this, a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true

But this conntrack is not fully initialized yet, so it cannot be used
by two threads concurrently. In this case a BUG_ON may be triggered
from nf_nat_setup_info().

Florian Westphal suggested also checking the confirmed bit. I think
that's right.

task 1			task 2			task 3
			nf_conntrack_find_get
			 ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
						__nf_conntrack_alloc
						 kmem_cache_alloc
						 memset(&ct->tuplehash[IP_CT_DIR_MAX],
			 if (nf_ct_is_dying(ct))
			 if (!nf_ct_tuple_equal()

I'm not sure that I have ever seen this race condition in real life.
We are currently investigating a bug that is reproduced on a few nodes.
In our case one conntrack is initialized by a few tasks concurrently;
we don't have any other explanation for this.

<2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
<4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>]  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
<4>[46267.085549] Call Trace:
<4>[46267.085622]  [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
<4>[46267.085697]  [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
<4>[46267.085770]  [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
<4>[46267.085843]  [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
<4>[46267.085919]  [<ffffffff814841b9>] nf_iterate+0x69/0xb0
<4>[46267.085991]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086063]  [<ffffffff81484374>] nf_hook_slow+0x74/0x110
<4>[46267.086133]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086207]  [<ffffffff814b5890>] ? dst_output+0x0/0x20
<4>[46267.086277]  [<ffffffff81495204>] ip_output+0xa4/0xc0
<4>[46267.086346]  [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
<4>[46267.086419]  [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
<4>[46267.086491]  [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
<4>[46267.086562]  [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
<4>[46267.086638]  [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
<4>[46267.086712]  [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
<4>[46267.086785]  [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
<4>[46267.086858]  [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
<4>[46267.086936]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087006]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087081]  [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
<4>[46267.087151]  [<ffffffff81445599>] sys_sendto+0x139/0x190
<4>[46267.087229]  [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
<4>[46267.087303]  [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
<4>[46267.087378]  [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
<4>[46267.087454]  [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
<4>[46267.087531]  [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
<4>[46267.087607]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
<4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
<1>[46267.088023] RIP  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ani Sinha <ani@arista.com>
---
 net/netfilter/nf_conntrack_core.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

Comments

Pablo Neira Ayuso Oct. 26, 2015, 8:06 p.m. UTC | #1
Hi,

On Mon, Oct 26, 2015 at 11:55:39AM -0700, Ani Sinha wrote:
> netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

Please, no need to Cc everyone here. Please, submit your Netfilter
patches to netfilter-devel@vger.kernel.org.

Moreover, it would be great if the subject included something
descriptive about what you need; for this I'd suggest:

[PATCH -stable 3.4,backport] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

I'm including Neal P. Murphy; he said he would help test these
backports. Getting a Tested-by: tag usually speeds things up too.

The burden here is usually huge; the easier you make it for us, the
better. Then we can review and, if there are no major concerns, I can
submit this to -stable.

Let me know if you have any other questions,
Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neal P. Murphy Oct. 28, 2015, 6:36 a.m. UTC | #2
On Mon, 26 Oct 2015 21:06:33 +0100
Pablo Neira Ayuso <pablo@netfilter.org> wrote:

I hammered it a couple nights ago. First test was 5000 processes on 6 SMP CPUs opening and closing a port on a 'remote' host using the usual random source ports. Only got up to 32000 conntracks. The generator was a 64-bit Smoothwall KVM without the patch. The traffic passed through a 32-bit Smoothwall KVM with the patch. The target was on the VM host. No problems encountered. I suspect I didn't come close to triggering the original problem.

Second test was a couple thousand processes all using the same source IP and port and dest IP and port. Still no problems. But these were perl scripts (and they used lots of RAM); perhaps a short C program would let me run more.

Any ideas on how I might test it more brutally?

N
Neal P. Murphy Oct. 29, 2015, 6:40 a.m. UTC | #3
On Wed, 28 Oct 2015 02:36:50 -0400
"Neal P. Murphy" <neal.p.murphy@alum.wpi.edu> wrote:

I've probably done about as much seat-of-the-pants testing as I can. All opening/closing the same destination IP/port.

Host: Debian Jessie, 8-core Vishera 8350 at 4.4 GHz, 16GiB RAM at (I think) 2100MHz.

Traffic generator 1: 6-CPU KVM running 64-bit Smoothwall Express 3.1 (linux 3.4.109 without these patches), with 8GiB RAM and 9GiB swap. Packets sent across PURPLE (to bypass NAT and firewall).

Traffic generator 2: 32-bit KVM running Smoothwall Express 3.1 (linux 3.4.110 with these patches), 3GiB RAM and minimal swap.

In the first set of tests, generator 1's traffic passed through Generator 2 as a NATting firewall, to the host's web server. In the second set of tests, generator 2's traffic went through NAT to the host's web server.

The load tests:
  - 2500 processes using 2500 addresses and random src ports
  - 2500 processes using 2500 addresses and the same src port
  - 2500 processes using the same src address and port

I also tested using stock NF timeouts and using 1 second timeouts.

Bandwidth used got as high as 16Mb/s for some tests.

Conntracks got up to 200 000 or so or bounced between 1 and 2, depending on the test and the timeouts.

I did not reproduce the problem these patches solve. But more importantly, I saw no problems at all. Each time I terminated a test, RAM usage returned to about that of post-boot; so there were no apparent memory leaks. No kernel messages and no netfilter messages appeared during the tests.

If I have time, I suppose I could run another set of tests: 2500 source processes using 2500 addresses times 200 ports to connect to 2500 addresses times 200 ports on a destination system. Each process opens 200 sockets, then closes them. And repeats ad infinitum. I might have to be clever since I can't run 500 000 processes, but I could run 20 VMs; that would get it down to about 12 000 processes per VM. And I might have to figure out how to allow processes on the destination system to open hundreds or thousands of sockets.

N
Ani Sinha Oct. 30, 2015, 12:01 a.m. UTC | #4
On Wed, Oct 28, 2015 at 11:40 PM, Neal P. Murphy
<neal.p.murphy@alum.wpi.edu> wrote:
> [detailed test report snipped]

Should I resend the patch with a Tested-by: tag?
Neal P. Murphy Oct. 30, 2015, 1:21 a.m. UTC | #5
On Thu, 29 Oct 2015 17:01:24 -0700
Ani Sinha <ani@arista.com> wrote:

> Should I resend the patch with a Tested-by: tag?

... Oh, wait. Not yet. The dawn just broke over ol' Marblehead here. I only tested TCP; I need to hammer UDP, too.

Can I set the timeouts to zero? Or is one as low as I can go?

N
Ani Sinha Oct. 30, 2015, 3:37 p.m. UTC | #6
On Thu, Oct 29, 2015 at 6:21 PM, Neal P. Murphy
<neal.p.murphy@alum.wpi.edu> wrote:
> ... Oh, wait. Not yet. The dawn just broke over ol' Marblehead here. I only tested TCP; I need to hammer UDP, too.
>
> Can I set the timeouts to zero? Or is one as low as I can go?

I don't see any assertion or check against 0-second timeouts. You can
try; your conntrack entries will be constantly flushing.

Ani Sinha Nov. 4, 2015, 11:46 p.m. UTC | #7
(removed a bunch of people from CC list)

On Mon, Oct 26, 2015 at 1:06 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:

> Then we can review and, if no major concerns, I can submit this to
> -stable.

Now that Neal has sufficiently tested the patches, is it OK to apply
to -stable or do you guys want me to do anything more?
Pablo Neira Ayuso Nov. 6, 2015, 11:09 a.m. UTC | #8
On Wed, Nov 04, 2015 at 03:46:54PM -0800, Ani Sinha wrote:
> (removed a bunch of people from CC list)
> 
> On Mon, Oct 26, 2015 at 1:06 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> 
> > Then we can review and, if no major concerns, I can submit this to
> > -stable.
> 
> Now that Neal has sufficiently tested the patches, is it OK to apply
> to -stable or do you guys want me to do anything more?

I'll pass this up to -stable asap.

Thanks.

Patch

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 9a46908..fd0f7a3 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -309,6 +309,21 @@  static void death_by_timeout(unsigned long ul_conntrack)
 	nf_ct_put(ct);
 }
 
+static inline bool
+nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
+			const struct nf_conntrack_tuple *tuple,
+			u16 zone)
+{
+	struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+
+	/* A conntrack can be recreated with the equal tuple,
+	 * so we need to check that the conntrack is confirmed
+	 */
+	return nf_ct_tuple_equal(tuple, &h->tuple) &&
+		nf_ct_zone(ct) == zone &&
+		nf_ct_is_confirmed(ct);
+}
+
 /*
  * Warning :
  * - Caller must take a reference on returned object
@@ -330,8 +345,7 @@  ____nf_conntrack_find(struct net *net, u16 zone,
 	local_bh_disable();
 begin:
 	hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[bucket], hnnode) {
-		if (nf_ct_tuple_equal(tuple, &h->tuple) &&
-		    nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)) == zone) {
+		if (nf_ct_key_equal(h, tuple, zone)) {
 			NF_CT_STAT_INC(net, found);
 			local_bh_enable();
 			return h;
@@ -378,8 +392,7 @@  begin:
 			     !atomic_inc_not_zero(&ct->ct_general.use)))
 			h = NULL;
 		else {
-			if (unlikely(!nf_ct_tuple_equal(tuple, &h->tuple) ||
-				     nf_ct_zone(ct) != zone)) {
+			if (unlikely(!nf_ct_key_equal(h, tuple, zone))) {
 				nf_ct_put(ct);
 				goto begin;
 			}