diff mbox series

[iptables,07/14] nft Increase mnl_talk() receive buffer size

Message ID 20190916165000.18217-8-phil@nwl.cc
State Accepted
Delegated to: Pablo Neira
Headers show
Series Improve iptables-nft performance with large rulesets | expand

Commit Message

Phil Sutter Sept. 16, 2019, 4:49 p.m. UTC
This improves cache population quite a bit and therefore helps when
dealing with large rulesets. A simple hard to improve use-case is
listing the last rule in a large chain. These are the average program
run times depending on number of rules:

rule count	| legacy	| nft old	| nft new
---------------------------------------------------------
 50,000		| .052s		| .611s		| .406s
100,000		| .115s		| 2.12s		| 1.24s
150,000		| .265s		| 7.63s		| 4.14s
200,000		| .411s		| 21.0s		| 10.6s

So while legacy iptables is still magnitudes faster, this simple change
doubles iptables-nft performance in ideal cases.

Note that increasing the buffer even further didn't improve performance
anymore, so 32KB seems to be an upper limit in kernel space.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 iptables/nft.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Pablo Neira Ayuso Sept. 17, 2019, 5 a.m. UTC | #1
On Mon, Sep 16, 2019 at 06:49:53PM +0200, Phil Sutter wrote:
> This improves cache population quite a bit and therefore helps when
> dealing with large rulesets. A simple hard to improve use-case is
> listing the last rule in a large chain.

You might consider extending the netlink interface too for this
particularly case, GETRULE plus position attribute could be used for
this I think. You won't be able to use this new operation from
userspace anytime soon though, given there is no way so far to expose
interface capabilities so far rather than probing.

If there are more particular corner cases like this, I would also
encourage to extend the netlink interface.

Just a side note, not a comment specifically on this patch :-).
Phil Sutter Sept. 17, 2019, 2:08 p.m. UTC | #2
Hi Pablo,

On Tue, Sep 17, 2019 at 07:00:38AM +0200, Pablo Neira Ayuso wrote:
> On Mon, Sep 16, 2019 at 06:49:53PM +0200, Phil Sutter wrote:
> > This improves cache population quite a bit and therefore helps when
> > dealing with large rulesets. A simple hard to improve use-case is
> > listing the last rule in a large chain.
> 
> You might consider extending the netlink interface too for this
> particularly case, GETRULE plus position attribute could be used for
> this I think. You won't be able to use this new operation from
> userspace anytime soon though, given there is no way so far to expose
> interface capabilities so far rather than probing.
> 
> If there are more particular corner cases like this, I would also
> encourage to extend the netlink interface.
> 
> Just a side note, not a comment specifically on this patch :-).

Thanks for suggesting, I didn't consider extending kernel to support the
index stuff yet. In general, I refrained from kernel changes simply
because of the compat problem. Implementing failure tolerance can
quickly turn into a mess, too.

Cheers, Phil
Pablo Neira Ayuso Sept. 20, 2019, 11:13 a.m. UTC | #3
On Mon, Sep 16, 2019 at 06:49:53PM +0200, Phil Sutter wrote:
> This improves cache population quite a bit and therefore helps when
> dealing with large rulesets. A simple hard to improve use-case is
> listing the last rule in a large chain. These are the average program
> run times depending on number of rules:
> 
> rule count	| legacy	| nft old	| nft new
> ---------------------------------------------------------
>  50,000		| .052s		| .611s		| .406s
> 100,000		| .115s		| 2.12s		| 1.24s
> 150,000		| .265s		| 7.63s		| 4.14s
> 200,000		| .411s		| 21.0s		| 10.6s
> 
> So while legacy iptables is still magnitudes faster, this simple change
> doubles iptables-nft performance in ideal cases.
> 
> Note that increasing the buffer even further didn't improve performance
> anymore, so 32KB seems to be an upper limit in kernel space.

Here are the details for this 32 KB number:

commit d35c99ff77ecb2eb239731b799386f3b3637a31e
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Oct 6 04:13:18 2016 +0900

    netlink: do not enter direct reclaim from netlink_dump()

iproute2 is also using 32 KBytes buffer, in case you want to append
this to your commit description before pushing this out.

> Signed-off-by: Phil Sutter <phil@nwl.cc>

Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter Sept. 23, 2019, 4:46 p.m. UTC | #4
Hi Pablo,

On Fri, Sep 20, 2019 at 01:13:29PM +0200, Pablo Neira Ayuso wrote:
> On Mon, Sep 16, 2019 at 06:49:53PM +0200, Phil Sutter wrote:
> > This improves cache population quite a bit and therefore helps when
> > dealing with large rulesets. A simple hard to improve use-case is
> > listing the last rule in a large chain. These are the average program
> > run times depending on number of rules:
> > 
> > rule count	| legacy	| nft old	| nft new
> > ---------------------------------------------------------
> >  50,000		| .052s		| .611s		| .406s
> > 100,000		| .115s		| 2.12s		| 1.24s
> > 150,000		| .265s		| 7.63s		| 4.14s
> > 200,000		| .411s		| 21.0s		| 10.6s
> > 
> > So while legacy iptables is still magnitudes faster, this simple change
> > doubles iptables-nft performance in ideal cases.
> > 
> > Note that increasing the buffer even further didn't improve performance
> > anymore, so 32KB seems to be an upper limit in kernel space.
> 
> Here are the details for this 32 KB number:

Thanks for those!

[...]
> iproute2 is also using 32 KBytes buffer, in case you want to append
> this to your commit description before pushing this out.

Commit message adjusted and pushed.

Thanks, Phil
diff mbox series

Patch

diff --git a/iptables/nft.c b/iptables/nft.c
index 6248b9eb10a85..7f0f9e1234ae4 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -101,7 +101,7 @@  int mnl_talk(struct nft_handle *h, struct nlmsghdr *nlh,
 	     void *data)
 {
 	int ret;
-	char buf[16536];
+	char buf[32768];
 
 	if (mnl_socket_sendto(h->nl, nlh, nlh->nlmsg_len) < 0)
 		return -1;