diff mbox

[1/1,linux-next] netfilter: conntrack: fix kmemleak false positive

Message ID 20160921210253.GB24153@breakpoint.cc
State Superseded
Delegated to: Pablo Neira
Headers show

Commit Message

Florian Westphal Sept. 21, 2016, 9:02 p.m. UTC
Fabian Frederick <fabf@skynet.be> wrote:
> Since commit f330a7fdbe16
> ("netfilter: conntrack: get rid of conntrack timer")
> 
> closed connections remain longer in /proc/net/nf_conntrack
> 
> Running current kernel; just after boot:
> cat /proc/net/nf_conntrack | wc -l = 5
> 4 minutes required to clean up the table.

We should reap the stale entries while iterating, just like
we do for ctnetlink interface.

Can you try this patch?


> Going back to kernel version before commit above there are
> no connections after some seconds.
> 
> Referring to the commit changelog this was an expected behaviour but
> it results in temporary kmemleak reports:

I don't see kmemleak complaints on my test vm, I'm reluctant to
turn it off.

Can you explain why we see such 'false positive'?

The conntracks should still be referenced, as they
are in main table.

> unreferenced object 0xffff88003b0e6600 (size 248):
>   comm "rsyslogd", pid 1595, jiffies 4294741312 (age 7.343s)
>   ...
>   backtrace:
>     [<ffffffff818f0163>] kmemleak_alloc+0x23/0x40
>     [<ffffffff8117abf9>] kmem_cache_alloc+0xd9/0x180
>     [<ffffffff817533a8>] __nf_conntrack_alloc.isra.50+0x48/0x170
>     [<ffffffff81754262>] nf_conntrack_in+0x3a2/0x5f0
>     [<ffffffff817bc9e0>] ipv4_conntrack_local+0x40/0x50
>     [<ffffffff8174f17d>] nf_iterate+0x5d/0x70
>     [<ffffffff8174f1ef>] nf_hook_slow+0x5f/0xb0
>     [<ffffffff8176ffbd>] __ip_local_out+0xad/0xe0
>     [<ffffffff81770007>] ip_local_out+0x17/0x40
>     [<ffffffff817711f4>] ip_send_skb+0x14/0x40
>     [<ffffffff81797591>] udp_send_skb+0x91/0x260
>     [<ffffffff817985e5>] udp_sendmsg+0x2f5/0x950
>     [<ffffffff817a4f80>] inet_sendmsg+0x60/0x90
>     [<ffffffff816fe7d3>] sock_sendmsg+0x33/0x40
>     [<ffffffff816fed0e>] SYSC_sendto+0xee/0x160
>     [<ffffffff816ff719>] SyS_sendto+0x9/0x10
> 
> (248 bytes being an nf_conn structure)
> 
> Those structures being cleared in gc_worker() later on we can't talk
> about unreferenced object so this patch uses kmemleak_not_leak() to
> prevent those warnings.

If thats the case, why is kmemleak complaining? Are you sure this
is a false positive?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Fabian Frederick Sept. 22, 2016, 5:55 p.m. UTC | #1
> On 21 September 2016 at 23:02 Florian Westphal <fw@strlen.de> wrote:
>
>
> Fabian Frederick <fabf@skynet.be> wrote:
> > Since commit f330a7fdbe16
> > ("netfilter: conntrack: get rid of conntrack timer")
> >
> > closed connections remain longer in /proc/net/nf_conntrack
> >
> > Running current kernel; just after boot:
> > cat /proc/net/nf_conntrack | wc -l = 5
> > 4 minutes required to clean up the table.
>
> We should reap the stale entries while iterating, just like
> we do for ctnetlink interface.
>
> Can you try this patch?
>
> diff --git a/net/netfilter/nf_conntrack_standalone.c
> b/net/netfilter/nf_conntrack_standalone.c
> --- a/net/netfilter/nf_conntrack_standalone.c
> +++ b/net/netfilter/nf_conntrack_standalone.c
> @@ -212,6 +212,11 @@ static int ct_seq_show(struct seq_file *s, void *v)
>       if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
>               return 0;

> +     if (nf_ct_should_gc(ct)) {
> +             nf_ct_kill(ct);
> +             goto release;
> +     }
> +
>       /* we only want to print DIR_ORIGINAL */
>       if (NF_CT_DIRECTION(hash))
>               goto release;
>

Hello Florian,

        First problem is solved: table gets cleared 3 minutes earlier
but I still have kmemleak before running the following:

echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
Nothing
echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
-> rsyslogd

I talked about false positive because everything is cleared later.

Note that problem appears only in a VM which is really slow due to
ksoftirqd eating lot of CPU for an unknown reason. Maybe you should test
somewhere else before applying.

Regards,
Fabian

> > Going back to kernel version before commit above there are
> > no connections after some seconds.
> >
> > Referring to the commit changelog this was an expected behaviour but
> > it results in temporary kmemleak reports:
>
> I don't see kmemleak complaints on my test vm, I'm reluctant to
> turn it off.
>
> Can you explain why we see such 'false positive'?
>
> The conntracks should still be referenced, as they
> are in main table.
>
> > unreferenced object 0xffff88003b0e6600 (size 248):
> >   comm "rsyslogd", pid 1595, jiffies 4294741312 (age 7.343s)
> >   ...
> >   backtrace:
> >     [] kmemleak_alloc+0x23/0x40
> >     [] kmem_cache_alloc+0xd9/0x180
> >     [] __nf_conntrack_alloc.isra.50+0x48/0x170
> >     [] nf_conntrack_in+0x3a2/0x5f0
> >     [] ipv4_conntrack_local+0x40/0x50
> >     [] nf_iterate+0x5d/0x70
> >     [] nf_hook_slow+0x5f/0xb0
> >     [] __ip_local_out+0xad/0xe0
> >     [] ip_local_out+0x17/0x40
> >     [] ip_send_skb+0x14/0x40
> >     [] udp_send_skb+0x91/0x260
> >     [] udp_sendmsg+0x2f5/0x950
> >     [] inet_sendmsg+0x60/0x90
> >     [] sock_sendmsg+0x33/0x40
> >     [] SYSC_sendto+0xee/0x160
> >     [] SyS_sendto+0x9/0x10
> >
> > (248 bytes being an nf_conn structure)
> >
> > Those structures being cleared in gc_worker() later on we can't talk
> > about unreferenced object so this patch uses kmemleak_not_leak() to
> > prevent those warnings.
>
> If thats the case, why is kmemleak complaining? Are you sure this
> is a false positive?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal Sept. 22, 2016, 9:56 p.m. UTC | #2
Fabian Frederick <fabf@skynet.be> wrote:
> Hello Florian,
> 
>         First problem is solved: table gets cleared 3 minutes earlier
> but I still have kmemleak before running the following:
> 
> echo scan > /sys/kernel/debug/kmemleak
> cat /sys/kernel/debug/kmemleak
> Nothing
> echo scan > /sys/kernel/debug/kmemleak
> cat /sys/kernel/debug/kmemleak
> -> rsyslogd
> 
> I talked about false positive because everything is cleared later.

Hmm, I fear this is a real bug and not false positive.

Should be possible to confirm this via slabinfo:

grep nf_conntrack /proc/slabinfo

The active objects should match the conntrack count.
(conntrack -C, or wc -l < /proc/....).

> > > unreferenced object 0xffff88003b0e6600 (size 248):
> > >   comm "rsyslogd", pid 1595, jiffies 4294741312 (age 7.343s)
> > >   ...
> > >   backtrace:
> > >     [] kmemleak_alloc+0x23/0x40
> > >     [] kmem_cache_alloc+0xd9/0x180
> > >     [] __nf_conntrack_alloc.isra.50+0x48/0x170
> > >     [] nf_conntrack_in+0x3a2/0x5f0
> > >     [] ipv4_conntrack_local+0x40/0x50
> > >     [] nf_iterate+0x5d/0x70
> > >     [] nf_hook_slow+0x5f/0xb0
> > >     [] __ip_local_out+0xad/0xe0
> > >     [] ip_local_out+0x17/0x40
> > >     [] ip_send_skb+0x14/0x40
> > >     [] udp_send_skb+0x91/0x260
> > >     [] udp_sendmsg+0x2f5/0x950
> > >     [] inet_sendmsg+0x60/0x90
> > >     [] sock_sendmsg+0x33/0x40
> > >     [] SYSC_sendto+0xee/0x160
> > >     [] SyS_sendto+0x9/0x10

Hmm, so we leak when allocating conntrack for outgoing packet.
Do you do any filtering (DROP) in output/postrouting?

> > > (248 bytes being an nf_conn structure)
> > >
> > > Those structures being cleared in gc_worker() later on we can't talk
> > > about unreferenced object so this patch uses kmemleak_not_leak() to
> > > prevent those warnings.
> >
> > If thats the case, why is kmemleak complaining? Are you sure this
> > is a false positive?

Looks like a real bug to me, but I don't see anything obvious so far.
I'll look at this again tomorrow.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Fabian Frederick Sept. 23, 2016, 7:40 p.m. UTC | #3
> On 22 September 2016 at 23:56 Florian Westphal <fw@strlen.de> wrote:
>
>
> Fabian Frederick <fabf@skynet.be> wrote:
> > Hello Florian,
> >
> >         First problem is solved: table gets cleared 3 minutes earlier
> > but I still have kmemleak before running the following:
> >
> > echo scan > /sys/kernel/debug/kmemleak
> > cat /sys/kernel/debug/kmemleak
> > Nothing
> > echo scan > /sys/kernel/debug/kmemleak
> > cat /sys/kernel/debug/kmemleak
> > -> rsyslogd
> >
> > I talked about false positive because everything is cleared later.
>
> Hmm, I fear this is a real bug and not false positive.
>
> Should be possible to confirm this via slabinfo:
>
> grep nf_conntrack /proc/slabinfo
>
> The active objects should match the conntrack count.
> (conntrack -C, or wc -l < /proc/....).

Softirq latencies have been solved today in linux-next: commit
60bf815c0d08
("Revert "softirq: fix tasklet_kill() and its users"")

but nf_conntrack still needs a lot of time to reach 0

Slabinfo always remains the following:

/proc/slabinfo: 
nf_conntrack          16     16    256   16

even when /proc/net/nf_conntrack is 0

Going back to kernel version before commit f330a7fdbe16
It's the same with 12 active objects:

/proc/slabinfo: 
nf_conntrack          12     12    320  12

>
> > > > unreferenced object 0xffff88003b0e6600 (size 248):
> > > >   comm "rsyslogd", pid 1595, jiffies 4294741312 (age 7.343s)
> > > >   ...
> > > >   backtrace:
> > > >     [] kmemleak_alloc+0x23/0x40
> > > >     [] kmem_cache_alloc+0xd9/0x180
> > > >     [] __nf_conntrack_alloc.isra.50+0x48/0x170
> > > >     [] nf_conntrack_in+0x3a2/0x5f0
> > > >     [] ipv4_conntrack_local+0x40/0x50
> > > >     [] nf_iterate+0x5d/0x70
> > > >     [] nf_hook_slow+0x5f/0xb0
> > > >     [] __ip_local_out+0xad/0xe0
> > > >     [] ip_local_out+0x17/0x40
> > > >     [] ip_send_skb+0x14/0x40
> > > >     [] udp_send_skb+0x91/0x260
> > > >     [] udp_sendmsg+0x2f5/0x950
> > > >     [] inet_sendmsg+0x60/0x90
> > > >     [] sock_sendmsg+0x33/0x40
> > > >     [] SYSC_sendto+0xee/0x160
> > > >     [] SyS_sendto+0x9/0x10
>
> Hmm, so we leak when allocating conntrack for outgoing packet.
> Do you do any filtering (DROP) in output/postrouting?
iptables -L gives ACCEPT for all.

>
> > > > (248 bytes being an nf_conn structure)
> > > >
> > > > Those structures being cleared in gc_worker() later on we can't talk
> > > > about unreferenced object so this patch uses kmemleak_not_leak() to
> > > > prevent those warnings.
> > >
> > > If thats the case, why is kmemleak complaining? Are you sure this
> > > is a false positive?
>
> Looks like a real bug to me, but I don't see anything obvious so far.
> I'll look at this again tomorrow.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -212,6 +212,11 @@  static int ct_seq_show(struct seq_file *s, void *v)
 	if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
 		return 0;
 
+	if (nf_ct_should_gc(ct)) {
+		nf_ct_kill(ct);
+		goto release;
+	}
+
 	/* we only want to print DIR_ORIGINAL */
 	if (NF_CT_DIRECTION(hash))
 		goto release;