diff mbox series

[iptables,08/14] xtables-restore: Avoid cache population when flushing

Message ID 20190916165000.18217-9-phil@nwl.cc
State Changes Requested
Delegated to: Pablo Neira
Headers show
Series Improve iptables-nft performance with large rulesets | expand

Commit Message

Phil Sutter Sept. 16, 2019, 4:49 p.m. UTC
When called without --noflush, don't fetch full ruleset from kernel but
merely list of tables and the current genid. Locally, initialize chain
lists and set have_cache to simulate an empty ruleset.

Doing so reduces execution time significantly if a large ruleset exists
in kernel space. A simple test case consisting of a dump with 100,000
rules can be restored within 15s on my testing VM. Restoring it a second
time took 21s before this patch and only 17s after it.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 iptables/nft.c             | 27 ++++++++++++++++++++++++++-
 iptables/nft.h             |  1 +
 iptables/xtables-restore.c |  7 +++++--
 3 files changed, 32 insertions(+), 3 deletions(-)

Comments

Pablo Neira Ayuso Sept. 20, 2019, 11:57 a.m. UTC | #1
Hi Phil,

On Mon, Sep 16, 2019 at 06:49:54PM +0200, Phil Sutter wrote:
> When called without --noflush, don't fetch full ruleset from kernel but
> merely list of tables and the current genid. Locally, initialize chain
> lists and set have_cache to simulate an empty ruleset.
> 
> Doing so reduces execution time significantly if a large ruleset exists
> in kernel space. A simple test case consisting of a dump with 100,000
> rules can be restored within 15s on my testing VM. Restoring it a second
> time took 21s before this patch and only 17s after it.
> 
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
>  iptables/nft.c             | 27 ++++++++++++++++++++++++++-
>  iptables/nft.h             |  1 +
>  iptables/xtables-restore.c |  7 +++++--
>  3 files changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/iptables/nft.c b/iptables/nft.c
> index 7f0f9e1234ae4..820f3392dd495 100644
> --- a/iptables/nft.c
> +++ b/iptables/nft.c
> @@ -882,7 +882,8 @@ static int flush_cache(struct nft_cache *c, const struct builtin_table *tables,
>  		nftnl_chain_list_free(c->table[i].chains);
>  		c->table[i].chains = NULL;
>  	}
> -	nftnl_table_list_free(c->tables);
> +	if (c->tables)
> +		nftnl_table_list_free(c->tables);
>  	c->tables = NULL;
>  
>  	return 1;
> @@ -1617,6 +1618,30 @@ void nft_build_cache(struct nft_handle *h)
>  		__nft_build_cache(h);
>  }
>  
> +void nft_fake_cache(struct nft_handle *h)
> +{
> +	int i;
> +
> +	if (h->have_cache)
> +		return;
> +
> +	/* fetch tables so conditional table delete logic works */
> +	if (!h->cache->tables)
> +		fetch_table_cache(h);
> +
> +	for (i = 0; i < NFT_TABLE_MAX; i++) {
> +		enum nft_table_type type = h->tables[i].type;
> +
> +		if (!h->tables[i].name ||
> +		    h->cache->table[type].chains)
> +			continue;
> +
> +		h->cache->table[type].chains = nftnl_chain_list_alloc();
> +	}
> +	mnl_genid_get(h, &h->nft_genid);
> +	h->have_cache = true;
> +}

Looking at patches from 8/24 onwards, I think it's time to introduce
cache flags logic to iptables.

In patch 9/14 there is a new field have_chain_cache.

In patch 10/14 there is have_rule_cache.

Then moving on, the logic is based on checking for this booleans and
then checking if the caches are initialized or not.

I think if we move towards cache flags logic (the flags tell us what
if we need no cache, partial cache (only tables, tables + chains) or
full cache.

This would make this logic easier to understand and more maintainable.
Phil Sutter Sept. 24, 2019, 2:43 p.m. UTC | #2
Hi Pablo,

On Fri, Sep 20, 2019 at 01:57:02PM +0200, Pablo Neira Ayuso wrote:
[...]
> Looking at patches from 8/24 onwards, I think it's time to introduce
> cache flags logic to iptables.
> 
> In patch 9/14 there is a new field have_chain_cache.
> 
> In patch 10/14 there is have_rule_cache.
> 
> Then moving on, the logic is based on checking for this booleans and
> then checking if the caches are initialized or not.
> 
> I think if we move towards cache flags logic (the flags tell us what
> if we need no cache, partial cache (only tables, tables + chains) or
> full cache.
> 
> This would make this logic easier to understand and more maintainable.

I am not entirely sure this is a feasible approach:

On one hand, we certainly can introduce cache "levels" to distinguish
between:

- no cache
- tables
- chains
- rules

simply because all these naturally build upon each others. On the other
hand, in my series I pushed the envelope (and watched it bend) a bit
further: There are basically two modes of caching:

- "traditional" breadth-first, compatible with above: e.g. all tables,
  if needed all chains as well, etc.

- a new depth-first mode which allows to fetch e.g. a certain chain's
  rules but no other chains/rules.

While the first mode is fine, second one really gives us the edge in
situations where legacy iptables is faster simply because "number
crunching" is more optimized there.

This second mode doesn't have explicit completion indicators like the
first one (with 'have_cache', etc.). In fact, the code treats it like a
fire-and-forget situation: If e.g. the same chain is requested twice,
the second time cache is simply fetched again but nftnl_chain_list_cb()
discards the fetched chain if one with same name already exists in
table's chain list.

Actually, the only application which requires more attention is
xtables-restore with --noflush. It is able to run multiple commands
consecutively and these may need kernel ruleset as context. Still we
don't want to fetch the full kernel ruleset upon each invocation because
it's how users speed up their ruleset manipulations.

In summary, things boil down to the following options:

A) Avoid caching, fetch only the most necessary things to do the job.
B) Build a full cache if needed anyway or if we can't predict.
C) Create a fake cache if we know kernel's ruleset is irrelevant.

I'll give it another try, aligning cache update logic to the above and
see if things become clean enough to be considered maintainable.

Cheers, Phil
diff mbox series

Patch

diff --git a/iptables/nft.c b/iptables/nft.c
index 7f0f9e1234ae4..820f3392dd495 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -882,7 +882,8 @@  static int flush_cache(struct nft_cache *c, const struct builtin_table *tables,
 		nftnl_chain_list_free(c->table[i].chains);
 		c->table[i].chains = NULL;
 	}
-	nftnl_table_list_free(c->tables);
+	if (c->tables)
+		nftnl_table_list_free(c->tables);
 	c->tables = NULL;
 
 	return 1;
@@ -1617,6 +1618,30 @@  void nft_build_cache(struct nft_handle *h)
 		__nft_build_cache(h);
 }
 
+void nft_fake_cache(struct nft_handle *h)
+{
+	int i;
+
+	if (h->have_cache)
+		return;
+
+	/* fetch tables so conditional table delete logic works */
+	if (!h->cache->tables)
+		fetch_table_cache(h);
+
+	for (i = 0; i < NFT_TABLE_MAX; i++) {
+		enum nft_table_type type = h->tables[i].type;
+
+		if (!h->tables[i].name ||
+		    h->cache->table[type].chains)
+			continue;
+
+		h->cache->table[type].chains = nftnl_chain_list_alloc();
+	}
+	mnl_genid_get(h, &h->nft_genid);
+	h->have_cache = true;
+}
+
 static void __nft_flush_cache(struct nft_handle *h)
 {
 	if (!h->cache_index) {
diff --git a/iptables/nft.h b/iptables/nft.h
index 43463d7f262e8..d9eb30a2a2413 100644
--- a/iptables/nft.h
+++ b/iptables/nft.h
@@ -74,6 +74,7 @@  int mnl_talk(struct nft_handle *h, struct nlmsghdr *nlh,
 int nft_init(struct nft_handle *h, const struct builtin_table *t);
 void nft_fini(struct nft_handle *h);
 void nft_build_cache(struct nft_handle *h);
+void nft_fake_cache(struct nft_handle *h);
 
 /*
  * Operations with tables.
diff --git a/iptables/xtables-restore.c b/iptables/xtables-restore.c
index f930f5ba2d167..d1486afedc480 100644
--- a/iptables/xtables-restore.c
+++ b/iptables/xtables-restore.c
@@ -96,6 +96,11 @@  void xtables_restore_parse(struct nft_handle *h,
 
 	line = 0;
 
+	if (h->noflush)
+		nft_build_cache(h);
+	else
+		nft_fake_cache(h);
+
 	/* Grab standard input. */
 	while (fgets(buffer, sizeof(buffer), p->in)) {
 		int ret = 0;
@@ -146,8 +151,6 @@  void xtables_restore_parse(struct nft_handle *h,
 			if (p->tablename && (strcmp(p->tablename, table) != 0))
 				continue;
 
-			nft_build_cache(h);
-
 			if (h->noflush == 0) {
 				DEBUGP("Cleaning all chains of table '%s'\n",
 					table);