Message ID: alpine.OSX.2.20.1510261144390.3919@athabasca.local
State: Awaiting Upstream, archived
Delegated to: David Miller
Hi,
On Mon, Oct 26, 2015 at 11:55:39AM -0700, Ani Sinha wrote:
> netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
Please, no need to Cc everyone here. Please, submit your Netfilter
patches to netfilter-devel@vger.kernel.org.
Moreover, it would be great if the subject includes something
descriptive on what you need, for this I'd suggest:
[PATCH -stable 3.4,backport] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
I'm including Neal P. Murphy; he said he would help test these
backports. Getting a Tested-by: tag usually speeds things up, too.
The burden here is usually huge, so the easier you make this for us, the better.
Then we can review and, if no major concerns, I can submit this to
-stable.
Let me know if you have any other questions,
Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 26 Oct 2015 21:06:33 +0100, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> [...]
> I'm including Neal P. Murphy, he said he would help testing these
> backports, getting a Tested-by: tag usually speeds up things too.

I hammered it a couple of nights ago.

First test was 5000 processes on 6 SMP CPUs opening and closing a port on a 'remote' host using the usual random source ports. Only got up to 32000 conntracks. The generator was a 64-bit Smoothwall KVM without the patch. The traffic passed through a 32-bit Smoothwall KVM with the patch. The target was on the VM host. No problems encountered. I suspect I didn't come close to triggering the original problem.

Second test was a couple thousand processes all using the same source IP and port and dest IP and port. Still no problems. But these were perl scripts (and they used lots of RAM); perhaps a short C program would let me run more.

Any ideas on how I might test it more brutally?

N
On Wed, 28 Oct 2015 02:36:50 -0400, "Neal P. Murphy" <neal.p.murphy@alum.wpi.edu> wrote:
> I hammered it a couple of nights ago. [...]
> Any ideas on how I might test it more brutally?

I've probably done about as much seat-of-the-pants testing as I can. All opening/closing the same destination IP/port.

Host: Debian Jessie, 8-core Vishera 8350 at 4.4 GHz, 16GiB RAM at (I think) 2100MHz.

Traffic generator 1: 6-CPU KVM running 64-bit Smoothwall Express 3.1 (linux 3.4.109 without these patches), with 8GiB RAM and 9GiB swap. Packets sent across PURPLE (to bypass NAT and firewall).

Traffic generator 2: 32-bit KVM running Smoothwall Express 3.1 (linux 3.4.110 with these patches), 3GiB RAM and minimal swap.

In the first set of tests, generator 1's traffic passed through generator 2 as a NATting firewall, to the host's web server. In the second set of tests, generator 2's traffic went through NAT to the host's web server.

The load tests:
- 2500 processes using 2500 addresses and random src ports
- 2500 processes using 2500 addresses and the same src port
- 2500 processes using the same src address and port

I also tested using stock NF timeouts and using 1 second timeouts. Bandwidth used got as high as 16Mb/s for some tests. Conntracks got up to 200 000 or so, or bounced between 1 and 2, depending on the test and the timeouts.

I did not reproduce the problem these patches solve. But more importantly, I saw no problems at all. Each time I terminated a test, RAM usage returned to about that of post-boot, so there were no apparent memory leaks. No kernel messages and no netfilter messages appeared during the tests.

If I have time, I suppose I could run another set of tests: 2500 source processes using 2500 addresses times 200 ports to connect to 2500 addresses times 200 ports on a destination system. Each process opens 200 sockets, then closes them. And repeats ad infinitum. But I might have to be clever since I can't run 500 000 processes; I could run 20 VMs, which would get it down to about 12 000 processes per VM. And I might have to figure out how to allow processes on the destination system to open hundreds or thousands of sockets.

N
On Wed, Oct 28, 2015 at 11:40 PM, Neal P. Murphy <neal.p.murphy@alum.wpi.edu> wrote:
> [...]
> I did not reproduce the problem these patches solve. But more
> importantly, I saw no problems at all. Each time I terminated a test,
> RAM usage returned to about that of post-boot; so there were no
> apparent memory leaks. No kernel messages and no netfilter messages
> appeared during the tests.

Should I resend the patch with a Tested-by: tag?
On Thu, 29 Oct 2015 17:01:24 -0700, Ani Sinha <ani@arista.com> wrote:
> Should I resend the patch with a Tested-by: tag?

... Oh, wait. Not yet. The dawn just broke over ol' Marblehead here. I only tested TCP; I need to hammer UDP, too.

Can I set the timeouts to zero? Or is one as low as I can go?

N
On Thu, Oct 29, 2015 at 6:21 PM, Neal P. Murphy <neal.p.murphy@alum.wpi.edu> wrote:
> Can I set the timeouts to zero? Or is one as low as I can go?

I don't see any assertion or check against 0 sec timeouts. You can try. Your conntrack entries will be constantly flushing.
(removed a bunch of people from the CC list)

On Mon, Oct 26, 2015 at 1:06 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Then we can review and, if no major concerns, I can submit this to
> -stable.

Now that Neal has sufficiently tested the patches, is it OK to apply to -stable, or do you guys want me to do anything more?
On Wed, Nov 04, 2015 at 03:46:54PM -0800, Ani Sinha wrote:
> Now that Neal has sufficiently tested the patches, is it OK to apply
> to -stable, or do you guys want me to do anything more?

I'll pass this up to -stable asap. Thanks.
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 9a46908..fd0f7a3 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -309,6 +309,21 @@ static void death_by_timeout(unsigned long ul_conntrack)
 	nf_ct_put(ct);
 }
 
+static inline bool
+nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
+		const struct nf_conntrack_tuple *tuple,
+		u16 zone)
+{
+	struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+
+	/* A conntrack can be recreated with the equal tuple,
+	 * so we need to check that the conntrack is confirmed
+	 */
+	return nf_ct_tuple_equal(tuple, &h->tuple) &&
+	       nf_ct_zone(ct) == zone &&
+	       nf_ct_is_confirmed(ct);
+}
+
 /*
  * Warning :
  * - Caller must take a reference on returned object
@@ -330,8 +345,7 @@ ____nf_conntrack_find(struct net *net, u16 zone,
 	local_bh_disable();
 begin:
 	hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[bucket], hnnode) {
-		if (nf_ct_tuple_equal(tuple, &h->tuple) &&
-		    nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)) == zone) {
+		if (nf_ct_key_equal(h, tuple, zone)) {
 			NF_CT_STAT_INC(net, found);
 			local_bh_enable();
 			return h;
@@ -378,8 +392,7 @@ begin:
 			     !atomic_inc_not_zero(&ct->ct_general.use)))
 			h = NULL;
 		else {
-			if (unlikely(!nf_ct_tuple_equal(tuple, &h->tuple) ||
-				     nf_ct_zone(ct) != zone)) {
+			if (unlikely(!nf_ct_key_equal(h, tuple, zone))) {
 				nf_ct_put(ct);
 				goto begin;
 			}