
[nf-next,3/9] netfilter: conntrack: don't attempt to iterate over empty table

Message ID 1461863628-23350-4-git-send-email-fw@strlen.de
State Not Applicable, archived

Commit Message

Florian Westphal April 28, 2016, 5:13 p.m. UTC
Once we place all conntracks into the same table, iteration becomes more
costly because the table contains conntracks that we are not interested
in (those belonging to other netns).

So don't bother scanning if the current namespace has no entries.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_conntrack_core.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Pablo Neira Ayuso May 3, 2016, 5:03 p.m. UTC | #1
On Thu, Apr 28, 2016 at 07:13:42PM +0200, Florian Westphal wrote:
> Once we place all conntracks into same table iteration becomes more
> costly because the table contains conntracks that we are not interested
> in (belonging to other netns).
> 
> So don't bother scanning if the current namespace has no entries.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/netfilter/nf_conntrack_core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 29fa08b..f2e75a5 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -1428,6 +1428,9 @@ void nf_ct_iterate_cleanup(struct net *net,
>  
>  	might_sleep();
>  
> +	if (atomic_read(&net->ct.count) == 0)
> +		return;

This optimization gets defeated with just one single conntrack (ie.
net->ct.count == 1), so I wonder if this is a practical thing.

At the cost of consuming more memory per conntrack, we may consider
adding a per-net list so this iteration doesn't become a problem.

>  	while ((ct = get_next_corpse(net, iter, data, &bucket)) != NULL) {
>  		/* Time to push up daises... */
>  		if (del_timer(&ct->timeout))
> -- 
> 2.7.3
> 
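The trade-off under discussion can be made concrete with a toy userspace model (this is not the kernel code; the names, sizes, and helpers here are invented for illustration). One table is shared by all namespaces, each namespace keeps an entry count standing in for net->ct.count, and cleanup walks the whole shared table unless the count check lets it bail out:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not the kernel code: one table shared by all namespaces,
 * plus a per-namespace entry count standing in for net->ct.count. */
#define TOY_TABLE_SIZE 4096

struct toy_net {
	int count;	/* entries owned by this netns */
};

static struct toy_net *toy_table[TOY_TABLE_SIZE];	/* slot owner or NULL */
static int toy_buckets_scanned;				/* instrumentation */

/* Walk the whole shared table and drop entries owned by @net.  The
 * early return mirrors the patch: an empty netns skips the full scan,
 * which would otherwise only visit other namespaces' conntracks. */
static void toy_iterate_cleanup(struct toy_net *net)
{
	size_t i;

	if (net->count == 0)	/* models atomic_read(&net->ct.count) == 0 */
		return;

	for (i = 0; i < TOY_TABLE_SIZE; i++) {
		toy_buckets_scanned++;
		if (toy_table[i] == net) {
			toy_table[i] = NULL;
			net->count--;
		}
	}
}
```

Pablo's objection maps directly onto this sketch: a namespace with even a single entry still pays the full TOY_TABLE_SIZE walk, so the early return only helps when the count is exactly zero.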
Florian Westphal May 3, 2016, 5:17 p.m. UTC | #2
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Thu, Apr 28, 2016 at 07:13:42PM +0200, Florian Westphal wrote:
> > Once we place all conntracks into same table iteration becomes more
> > costly because the table contains conntracks that we are not interested
> > in (belonging to other netns).
> > 
> > So don't bother scanning if the current namespace has no entries.
> > 
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> > ---
> >  net/netfilter/nf_conntrack_core.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> > index 29fa08b..f2e75a5 100644
> > --- a/net/netfilter/nf_conntrack_core.c
> > +++ b/net/netfilter/nf_conntrack_core.c
> > @@ -1428,6 +1428,9 @@ void nf_ct_iterate_cleanup(struct net *net,
> >  
> >  	might_sleep();
> >  
> > +	if (atomic_read(&net->ct.count) == 0)
> > +		return;
> 
> This optimization gets defeated with just one single conntrack (ie.
> net->ct.count == 1), so I wonder if this is practical thing.

I was thinking of the cleanup we do in the netns exit path
(in nf_conntrack_cleanup_net_list() ).

If you don't like this I can move the check here:

i_see_dead_people:
    busy = 0;
    list_for_each_entry(net, net_exit_list, exit_list) {
    // here
    if (atomic_read .. > 0)
       nf_ct_iterate_cleanup(net, kill_all, ...

> At the cost of consuming more memory per conntrack, we may consider
> adding a per-net list so this iteration doesn't become a problem.

I don't think that will be needed.   We don't have any such iterations
in the fast path.

For dumps via ctnetlink it shouldn't be a big deal either; if needed
we can optimize that to use rcu read locks only and 'upgrade' to the
locked path only when we want to dump the candidate ct.

early_drop will go away soon (I'll rework it to do the early_drop from
a work queue).
Pablo Neira Ayuso May 3, 2016, 5:41 p.m. UTC | #3
On Tue, May 03, 2016 at 07:17:44PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Thu, Apr 28, 2016 at 07:13:42PM +0200, Florian Westphal wrote:
> > > Once we place all conntracks into same table iteration becomes more
> > > costly because the table contains conntracks that we are not interested
> > > in (belonging to other netns).
> > > 
> > > So don't bother scanning if the current namespace has no entries.
> > > 
> > > Signed-off-by: Florian Westphal <fw@strlen.de>
> > > ---
> > >  net/netfilter/nf_conntrack_core.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> > > index 29fa08b..f2e75a5 100644
> > > --- a/net/netfilter/nf_conntrack_core.c
> > > +++ b/net/netfilter/nf_conntrack_core.c
> > > @@ -1428,6 +1428,9 @@ void nf_ct_iterate_cleanup(struct net *net,
> > >  
> > >  	might_sleep();
> > >  
> > > +	if (atomic_read(&net->ct.count) == 0)
> > > +		return;
> > 
> > This optimization gets defeated with just one single conntrack (ie.
> > net->ct.count == 1), so I wonder if this is practical thing.
> 
> I was thinking of the cleanup we do in the netns exit path
> (in nf_conntrack_cleanup_net_list() ).

Right, but in that path we still have entries in the table.

> If you don't like this I can move the check here:
> 
> i_see_dead_people:
>     busy = 0;
>     list_for_each_entry(net, net_exit_list, exit_list) {
>     // here
>     if (atomic_read .. > 0)
>        nf_ct_iterate_cleanup(net, kill_all, ...

I don't mind placing it here or there; as I said, my question is
how often we will hit this optimization in a real scenario.

If you think the answer is often, then this will help.

Otherwise, every time we go down the container destruction path, we'll
hit the slow path, i.e. scanning the full table.

> > At the cost of consuming more memory per conntrack, we may consider
> > adding a per-net list so this iteration doesn't become a problem.
> 
> I don't think that will be needed.   We don't have any such iterations
> in the fast path.
>
> For dumps via ctnetlink it shouldn't be a big deal either, if needed
> we can optimize that to use rcu readlocks only and 'upgrade' to locked
> path only when we want to dump the candidate ct.
> for deferred pruning).
> early_drop will go away soon (i'll rework it to do the early_drop from
> work queue).

OK.
Florian Westphal May 3, 2016, 5:55 p.m. UTC | #4
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > I was thinking of the cleanup we do in the netns exit path
> > (in nf_conntrack_cleanup_net_list() ).
> 
> Right, but in that path we still have entries in the table.

Not necessarily, they might have already been removed
(timeout, close).

> > If you don't like this I can move the check here:
> > 
> > i_see_dead_people:
> >     busy = 0;
> >     list_for_each_entry(net, net_exit_list, exit_list) {
> >     // here
> >     if (atomic_read .. > 0)
> >        nf_ct_iterate_cleanup(net, kill_all, ...
> 
> I don't mind about placing this or there, as I said, my question is
> how often we will hit this optimization in a real scenario.
> 
> If you think the answer is often, then this will help.

I think the extra atomic_read in this code does no harm and
saves us the entire scan.  Also, in the exit path, when we hit the
'i_see_dead_people' label we restart the entire loop, so if we
have 200 netns on the list and the last one caused that restart,
we re-iterate needlessly over 199 netns...
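That restart cost can be sketched with a toy model (plain userspace C, not the kernel code; the constants and helper names are made up). Each namespace starts with one entry, the last one needs a second pass, and every cleanup call is counted as one full-table scan:

```c
#include <assert.h>

/* Toy model of the netns exit path (not the kernel code): NETNS
 * namespaces on net_exit_list, each owning one conntrack entry; the
 * last namespace needs a second pass, forcing the
 * 'i_see_dead_people' restart over the whole list. */
#define NETNS 200

static int full_scans;	/* full-table scans paid for */

/* One cleanup pass for a namespace: always costs a full scan of the
 * shared table; returns how many entries remain afterwards. */
static int cleanup_pass(int count, int *sticky)
{
	full_scans++;
	if (count && *sticky) {
		*sticky = 0;
		return 1;	/* entry still in use, retry later */
	}
	return 0;
}

/* Run the exit loop; @check_empty enables the proposed count check. */
static int toy_exit_path(int check_empty)
{
	int count[NETNS], sticky[NETNS] = {0};
	int i, busy;

	full_scans = 0;
	for (i = 0; i < NETNS; i++)
		count[i] = 1;
	sticky[NETNS - 1] = 1;	/* last netns forces one restart */

i_see_dead_people:
	busy = 0;
	for (i = 0; i < NETNS; i++) {
		if (check_empty && count[i] == 0)
			continue;	/* the proposed atomic_read check */
		count[i] = cleanup_pass(count[i], &sticky[i]);
		if (count[i])
			busy = 1;
	}
	if (busy)
		goto i_see_dead_people;
	return full_scans;
}
```

Without the check the restart rescans all 200 namespaces (2 * NETNS scans in total); with it, the 199 already-empty namespaces are skipped and only the busy one is revisited (NETNS + 1 scans).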

> Otherwise, every time we'll go container destruction path, we'll hit
> slow path, ie.  scanning the full table.

Yes, but I see no other choice.
Pablo Neira Ayuso May 3, 2016, 10:27 p.m. UTC | #5
On Tue, May 03, 2016 at 07:55:59PM +0200, Florian Westphal wrote:
> > Otherwise, every time we'll go container destruction path, we'll hit
> > slow path, ie.  scanning the full table.
> 
> Yes, but I see no other choice.

Fair enough, will place this in nf-next, thanks.

Patch

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 29fa08b..f2e75a5 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1428,6 +1428,9 @@ void nf_ct_iterate_cleanup(struct net *net,
 
 	might_sleep();
 
+	if (atomic_read(&net->ct.count) == 0)
+		return;
+
 	while ((ct = get_next_corpse(net, iter, data, &bucket)) != NULL) {
 		/* Time to push up daises... */
 		if (del_timer(&ct->timeout))