diff mbox series

[iptables,v4,4/5] xtables: Fix for inserting rule at wrong position

Message ID 20181230190612.29413-5-phil@nwl.cc
State Changes Requested
Delegated to: Pablo Neira
Headers show
Series Separate rule cache per chain et al. | expand

Commit Message

Phil Sutter Dec. 30, 2018, 7:06 p.m. UTC
iptables-restore allows to insert rules at a certain position which is
problematic for iptables-nft to realize since rule position is not
determined by number but handle of previous or following rule.

To fix this, make use of the rule cache (which holds in-kernel rules
with their associated handle already). Insert new rules at the right
position into that cache, then at commit time (i.e., in nft_action())
traverse each chain's rule list for new rules to add:

* New rules at beginning of list are inserted in reverse order without
  reference of another rule, so that each consecutive rule is added
  before all previous ones.

* Each time an "old" (i.e., in-kernel) rule is encountered, its handle
  is stored (overwriting any previous ones').

* New rules following an old one are appended to the old one (by
  specifying its handle) in reverse order, so that each consecutive rule
  is inserted between the old one and previously appended ones.

For this to function, built-in chains need to be in cache. Consequently,
batch jobs with type NFT_COMPAT_CHAIN_ADD must not delete their payload.
A nice side-effect of that is when initializing builtin tables/chains,
no explicit call to nft_commit() is required anymore.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 iptables/nft.c                                | 182 +++++++++++-------
 .../ipt-restore/0003-restore-ordering_0       |  94 +++++++++
 .../testcases/iptables/0005-rule-replace_0    |  38 ++++
 3 files changed, 243 insertions(+), 71 deletions(-)
 create mode 100755 iptables/tests/shell/testcases/ipt-restore/0003-restore-ordering_0
 create mode 100755 iptables/tests/shell/testcases/iptables/0005-rule-replace_0

Comments

Pablo Neira Ayuso Jan. 9, 2019, 11:50 p.m. UTC | #1
Hi Phil,

On Sun, Dec 30, 2018 at 08:06:11PM +0100, Phil Sutter wrote:
> iptables-restore allows to insert rules at a certain position which is
> problematic for iptables-nft to realize since rule position is not
> determined by number but handle of previous or following rule.

Looking at this late postprocessing in nft_action()... a bit of
brainstorming.

Why not same approach as in rule_translate_index() in nftables? I
guess answer is this is also incomplete, right?

If that is not enough, we could add support for NFTA_RULE_ID to
nf_tables_newrule().  This NFTA_RULE_ID attribute provides a way to
refer to rules coming in this batch. It is similar to NFTA_SET_ID,
however, currently it is only supported from the rule deletion path.
This ID is internal to the transaction - we can freely allocate IDs
for the new rules coming in the batch. My concern is that we may have
problems later on if we do not address limitations in the kernel
interface.

Or thinking if we can do this with the existing interface...  see
below.

> To fix this, make use of the rule cache (which holds in-kernel rules
> with their associated handle already). Insert new rules at the right
> position into that cache, then at commit time (i.e., in nft_action())
> traverse each chain's rule list for new rules to add:
> 
> * New rules at beginning of list are inserted in reverse order without
>   reference of another rule, so that each consecutive rule is added
>   before all previous ones.

I think this refers to rules like:

iptables -I chain 1 ...

This case we can just handle it by doing plain rule insertion.

> * Each time an "old" (i.e., in-kernel) rule is encountered, its handle
>   is stored (overwriting any previous ones').
>
> * New rules following an old one are appended to the old one (by
>   specifying its handle) in reverse order, so that each consecutive rule
>   is inserted between the old one and previously appended ones.

For existing rules, we already have an absolute reference through the
handle, then the NLM_F_APPEND flag tells if we want to place it before
or after that rule.

What scenario is forcing us to do this late postprocessing below that
we cannot handle with the existing interface? :-)

Thanks!

[...]
> @@ -2545,6 +2495,78 @@ static void batch_obj_del(struct nft_handle *h, struct obj_update *o)
>  	free(o);
>  }
>  
> +struct rule_stack_data {
> +	struct nft_handle *h;
> +	uint64_t handle;
> +	uint32_t seq;
> +};
> +
> +static int rule_stack_cb(struct nftnl_rule *r, void *data)
> +{
> +	struct rule_stack_data *rsd = data;
> +	uint16_t flags = NLM_F_CREATE;
> +
> +	if (rsd->handle) {
> +		/* append rule to previous in-kernel one */
> +		flags |= NLM_F_APPEND;
> +		nftnl_rule_set_u64(r, NFTNL_RULE_POSITION, rsd->handle);
> +	}
> +	nft_compat_rule_batch_add(rsd->h, NFT_MSG_NEWRULE,
> +				  flags, rsd->seq++, r);
> +	mnl_nft_batch_continue(rsd->h->batch);
> +	nftnl_rule_list_del(r);
> +	nftnl_rule_free(r);
> +	return 0;
> +}
> +
> +struct batch_from_chain_cache_data {
> +	struct nft_handle	*handle;
> +	uint32_t		seq;
> +};
> +
> +static int batch_from_chain_cache(struct nftnl_chain *c, void *data)
> +{
> +	struct batch_from_chain_cache_data *d = data;
> +	struct rule_stack_data rsd = {
> +		.h = d->handle,
> +		.seq = d->seq,
> +	};
> +	struct nftnl_rule_list *stack;
> +	struct nftnl_rule_iter *iter;
> +	struct nftnl_rule *r;
> +
> +	stack = nftnl_rule_list_alloc();
> +	if (!stack)
> +		return -1;
> +
> +	iter = nftnl_rule_iter_create(c);
> +	if (!iter)
> +		return -1;
> +
> +	r = nftnl_rule_iter_next(iter);
> +	while (r) {
> +		uint64_t handle = nftnl_rule_get_u64(r, NFTNL_RULE_HANDLE);
> +
> +		if (!handle) {
> +			/* new rule to be added, put onto stack */
> +			nftnl_rule_list_del(r);
> +			nftnl_rule_list_add(r, stack);
> +		} else {
> +			/* rule in kernel already, handle rules in stack */
> +			nftnl_rule_list_foreach(stack, rule_stack_cb, &rsd);
> +			d->seq = rsd.seq;
> +			rsd.handle = handle;
> +		}
> +		r = nftnl_rule_iter_next(iter);
> +	}
> +	nftnl_rule_iter_destroy(iter);
> +
> +	nftnl_rule_list_foreach(stack, rule_stack_cb, &rsd);
> +	nftnl_rule_list_free(stack);
> +	d->seq = rsd.seq;
> +	return 0;
> +}
> +
>  static int nft_action(struct nft_handle *h, int action)
>  {
>  	struct obj_update *n, *tmp;
> @@ -2628,6 +2650,24 @@ static int nft_action(struct nft_handle *h, int action)
>  		mnl_nft_batch_continue(h->batch);
>  	}
>  
> +	if (h->have_cache) {
> +		struct batch_from_chain_cache_data d = {
> +			.handle = h,
> +			.seq = seq,
> +		};
> +
> +		for (i = 0; i < NFT_TABLE_MAX; i++) {
> +			const char *cur_table = h->tables[i].name;
> +
> +			if (!cur_table)
> +				continue;
> +
> +			nftnl_chain_list_foreach(h->table[i].chain_cache,
> +						 batch_from_chain_cache, &d);
> +		}
> +		seq = d.seq;
> +	}
> +
>  	switch (action) {
>  	case NFT_COMPAT_COMMIT:
>  		mnl_batch_end(h->batch, seq++);
> diff --git a/iptables/tests/shell/testcases/ipt-restore/0003-restore-ordering_0 b/iptables/tests/shell/testcases/ipt-restore/0003-restore-ordering_0
> new file mode 100755
> index 0000000000000..44ee796ef44df
> --- /dev/null
> +++ b/iptables/tests/shell/testcases/ipt-restore/0003-restore-ordering_0
> @@ -0,0 +1,94 @@
> +#!/bin/bash
> +
> +# Make sure iptables-restore does the right thing
> +# when encountering INSERT rules with index.
> +
> +set -e
> +
> +# show rules, drop uninteresting policy settings
> +ipt_show() {
> +	$XT_MULTI iptables -S | grep -v '^-P'
> +}
> +
> +# basic issue reproducer
> +
> +$XT_MULTI iptables-restore <<EOF
> +*filter
> +-A FORWARD -m comment --comment "appended rule" -j ACCEPT
> +-I FORWARD 1 -m comment --comment "rule 1" -j ACCEPT
> +-I FORWARD 2 -m comment --comment "rule 2" -j ACCEPT
> +-I FORWARD 3 -m comment --comment "rule 3" -j ACCEPT
> +COMMIT
> +EOF
> +
> +EXPECT='-A FORWARD -m comment --comment "rule 1" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 2" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 3" -j ACCEPT
> +-A FORWARD -m comment --comment "appended rule" -j ACCEPT'
> +
> +diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
> +
> +# insert rules into existing ruleset
> +
> +$XT_MULTI iptables-restore --noflush <<EOF
> +*filter
> +-I FORWARD 1 -m comment --comment "rule 0.5" -j ACCEPT
> +-I FORWARD 3 -m comment --comment "rule 1.5" -j ACCEPT
> +-I FORWARD 5 -m comment --comment "rule 2.5" -j ACCEPT
> +-I FORWARD 7 -m comment --comment "rule 3.5" -j ACCEPT
> +-I FORWARD 9 -m comment --comment "appended rule 2" -j ACCEPT
> +COMMIT
> +EOF
> +
> +EXPECT='-A FORWARD -m comment --comment "rule 0.5" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 1" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 1.5" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 2" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 2.5" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 3" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 3.5" -j ACCEPT
> +-A FORWARD -m comment --comment "appended rule" -j ACCEPT
> +-A FORWARD -m comment --comment "appended rule 2" -j ACCEPT'
> +
> +diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
> +
> +# insert rules in between added ones
> +
> +$XT_MULTI iptables-restore <<EOF
> +*filter
> +-A FORWARD -m comment --comment "appended rule 1" -j ACCEPT
> +-A FORWARD -m comment --comment "appended rule 2" -j ACCEPT
> +-A FORWARD -m comment --comment "appended rule 3" -j ACCEPT
> +-I FORWARD 1 -m comment --comment "rule 1" -j ACCEPT
> +-I FORWARD 3 -m comment --comment "rule 2" -j ACCEPT
> +-I FORWARD 5 -m comment --comment "rule 3" -j ACCEPT
> +COMMIT
> +EOF
> +
> +EXPECT='-A FORWARD -m comment --comment "rule 1" -j ACCEPT
> +-A FORWARD -m comment --comment "appended rule 1" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 2" -j ACCEPT
> +-A FORWARD -m comment --comment "appended rule 2" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 3" -j ACCEPT
> +-A FORWARD -m comment --comment "appended rule 3" -j ACCEPT'
> +
> +diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
> +
> +# test rule deletion in dump files
> +
> +$XT_MULTI iptables-restore --noflush <<EOF
> +*filter
> +-D FORWARD -m comment --comment "appended rule 1" -j ACCEPT
> +-D FORWARD 3
> +-I FORWARD 3 -m comment --comment "manually replaced rule 2" -j ACCEPT
> +COMMIT
> +EOF
> +
> +EXPECT='-A FORWARD -m comment --comment "rule 1" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 2" -j ACCEPT
> +-A FORWARD -m comment --comment "manually replaced rule 2" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 3" -j ACCEPT
> +-A FORWARD -m comment --comment "appended rule 3" -j ACCEPT'
> +
> +diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
> +
> diff --git a/iptables/tests/shell/testcases/iptables/0005-rule-replace_0 b/iptables/tests/shell/testcases/iptables/0005-rule-replace_0
> new file mode 100755
> index 0000000000000..5a3e922e50672
> --- /dev/null
> +++ b/iptables/tests/shell/testcases/iptables/0005-rule-replace_0
> @@ -0,0 +1,38 @@
> +#!/bin/bash
> +
> +# test rule replacement
> +
> +set -e
> +
> +# show rules, drop uninteresting policy settings
> +ipt_show() {
> +	$XT_MULTI iptables -S | grep -v '^-P'
> +}
> +
> +$XT_MULTI iptables -A FORWARD -m comment --comment "rule 1" -j ACCEPT
> +$XT_MULTI iptables -A FORWARD -m comment --comment "rule 2" -j ACCEPT
> +$XT_MULTI iptables -A FORWARD -m comment --comment "rule 3" -j ACCEPT
> +
> +$XT_MULTI iptables -R FORWARD 2 -m comment --comment "replaced 2" -j ACCEPT
> +
> +EXPECT='-A FORWARD -m comment --comment "rule 1" -j ACCEPT
> +-A FORWARD -m comment --comment "replaced 2" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 3" -j ACCEPT'
> +
> +diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
> +
> +$XT_MULTI iptables -R FORWARD 1 -m comment --comment "replaced 1" -j ACCEPT
> +
> +EXPECT='-A FORWARD -m comment --comment "replaced 1" -j ACCEPT
> +-A FORWARD -m comment --comment "replaced 2" -j ACCEPT
> +-A FORWARD -m comment --comment "rule 3" -j ACCEPT'
> +
> +diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
> +
> +$XT_MULTI iptables -R FORWARD 3 -m comment --comment "replaced 3" -j ACCEPT
> +
> +EXPECT='-A FORWARD -m comment --comment "replaced 1" -j ACCEPT
> +-A FORWARD -m comment --comment "replaced 2" -j ACCEPT
> +-A FORWARD -m comment --comment "replaced 3" -j ACCEPT'
> +
> +diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
> -- 
> 2.19.0
>
Phil Sutter Jan. 10, 2019, 10:17 a.m. UTC | #2
Hi Pablo,

On Thu, Jan 10, 2019 at 12:50:34AM +0100, Pablo Neira Ayuso wrote:
> On Sun, Dec 30, 2018 at 08:06:11PM +0100, Phil Sutter wrote:
> > iptables-restore allows to insert rules at a certain position which is
> > problematic for iptables-nft to realize since rule position is not
> > determined by number but handle of previous or following rule.
> 
> Looking at this late postprocessing in nft_action()... a bit of
> brainstorming.
> 
> Why not same approach as in rule_translate_index() in nftables? I
> guess answer is this is also incomplete, right?

Yes, that is not sufficient, see below for an example.

> If that is not enough, we could add support for NFTA_RULE_ID to
> nf_tables_newrule().  This NFTA_RULE_ID attribute provides a way to
> refer to rules coming in this batch. It is similar to NFTA_SET_ID,
> however, currently it is only supported from the rule deletion path.
> This ID is internal to the transaction - we can freely allocate IDs
> for the new rules coming in the batch. My concern is that we may have
> problems later on if we do not address limitations in the kernel
> interface.

That might help, yes. Though depending on how this will be applied and
iptables-restore input, we may still run into ugly issues.

> Or thinking if we can do this with the existing interface...  see
> below.
> 
> > To fix this, make use of the rule cache (which holds in-kernel rules
> > with their associated handle already). Insert new rules at the right
> > position into that cache, then at commit time (i.e., in nft_action())
> > traverse each chain's rule list for new rules to add:
> > 
> > * New rules at beginning of list are inserted in reverse order without
> >   reference of another rule, so that each consecutive rule is added
> >   before all previous ones.
> 
> I think this refers to rules like:
> 
> iptables -I chain 1 ...
> 
> This case we can just handle it by doing plain rule insertion.

Not quite, a user may simply do:

-I chain 1 <rule 1>
-I chain 1 <rule 2>

Legacy code inserts them in order (i.e., rule 2 follows rule 1) while
the simple approach you have in mind reverses the ordering.

> > * Each time an "old" (i.e., in-kernel) rule is encountered, its handle
> >   is stored (overwriting any previous ones').
> >
> > * New rules following an old one are appended to the old one (by
> >   specifying its handle) in reverse order, so that each consecutive rule
> >   is inserted between the old one and previously appended ones.
> 
> For existing rules, we already have an absolute reference through the
> handle, then the NLM_F_APPEND flag tells if we want to place it before
> or after that rule.
> 
> What scenario is forcing us to do this late postprocessing below that
> we cannot handle with the existing interface? :-)

Kernel has this:

-A chain <rule 1> # handle 10
-A chain <rule 2> # handle 20

User does:

-I chain 2 <rule a>
-I chain 3 <rule b>

Handling rule a in iptables-nft-restore is easy, just insert before
handle 20 or append to handle 10. If we don't add this rule to cache,
rule b will be appended to ruleset (since the rulenum it refers to is
larger than the number of rules in cache), although it should be
inserted in between rule a and rule 2.

So we need to add new rules to cache, otherwise following rule's numbers
refer to the wrong rule. Now consider a cache of:

- rule 1 # handle 10
- rule 2 # new
- rule 3 # new

And we parse this command:

-I chain 3 <rule 4>

Due to missing handle, we can neither append to rule 2 nor insert before
rule 3. How do we get rule 4 in between rule 2 and rule 3?

Given the freedom users have when creating input to
iptables-nft-restore, all this quickly turns into a can of worms. At
least I am not able to keep all possible situations in mind in order to
get the code right. That's why I chose the most straight forward
approach that occurred to me: Put all rules in cache at the right spot
and do the insert/append with handle dance later when all rules are in
place.

Cheers, Phil
Pablo Neira Ayuso Jan. 10, 2019, 10:53 p.m. UTC | #3
Hi Phil,

On Thu, Jan 10, 2019 at 11:17:36AM +0100, Phil Sutter wrote:
> Hi Pablo,
> 
> On Thu, Jan 10, 2019 at 12:50:34AM +0100, Pablo Neira Ayuso wrote:
> > On Sun, Dec 30, 2018 at 08:06:11PM +0100, Phil Sutter wrote:
> > > iptables-restore allows to insert rules at a certain position which is
> > > problematic for iptables-nft to realize since rule position is not
> > > determined by number but handle of previous or following rule.
> > 
> > Looking at this late postprocessing in nft_action()... a bit of
> > brainstorming.
> > 
> > Why not same approach as in rule_translate_index() in nftables? I
> > guess answer is this is also incomplete, right?
> 
> Yes, that is not sufficient, see below for an example.
> 
> > If that is not enough, we could add support for NFTA_RULE_ID to
> > nf_tables_newrule().  This NFTA_RULE_ID attribute provides a way to
> > refer to rules coming in this batch. It is similar to NFTA_SET_ID,
> > however, currently it is only supported from the rule deletion path.
> > This ID is internal to the transaction - we can freely allocate IDs
> > for the new rules coming in the batch. My concern is that we may have
> > problems later on if we do not address limitations in the kernel
> > interface.
> 
> That might help, yes. Though depending on how this will be applied and
> iptables-restore input, we may still run into ugly issues.

I'm under the impression that the problem is the lack of handle for
rules that are not in the kernel, see below.

> > Or thinking if we can do this with the existing interface...  see
> > below.
> > 
> > > To fix this, make use of the rule cache (which holds in-kernel rules
> > > with their associated handle already). Insert new rules at the right
> > > position into that cache, then at commit time (i.e., in nft_action())
> > > traverse each chain's rule list for new rules to add:
> > > 
> > > * New rules at beginning of list are inserted in reverse order without
> > >   reference of another rule, so that each consecutive rule is added
> > >   before all previous ones.
> > 
> > I think this refers to rules like:
> > 
> > iptables -I chain 1 ...
> > 
> > This case we can just handle it by doing plain rule insertion.
> 
> Not quite, a user may simply do:
> 
> -I chain 1 <rule 1>
> -I chain 1 <rule 2>
> 
> Legacy code inserts them in order (i.e., rule 2 follows rule 1) while
> the simple approach you have in mind reverses the ordering.

I might be misunderstanding anything from above, but legacy is
reversing the ordering here:

# cat x
# Generated by xtables-save v1.8.2 on Thu Jan 10 01:10:06 2019
*filter
-I INPUT 1 -j ACCEPT
-I INPUT 1 -j DROP
COMMIT
# Completed on Thu Jan 10 01:10:06 2019

# iptables-legacy-restore < x

# iptables-legacy-save
# Generated by iptables-save v1.8.2 on Thu Jan 10 23:12:35 2019
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -j DROP
-A INPUT -j ACCEPT
COMMIT
# Completed on Thu Jan 10 23:12:35 2019

So insertion semantics on the same position looks correct to me.

For iptables-nft, I would place rule 1, then look for rule in position
number 1 and insert it before.

I think we have two scenarios:

#1 normal iptables-restore: in this case, kernel rules are ignored
   since we assume an implicit flush. In this case, we need a variant
   of batch_rule_add_at() that allows us to place the rules at the
   right position.

#2 --noflush is specified: in this case, two sub-scenarios:

#2.1 we find an existing rule with a handle at the position, we use the
     rule handle that we found as NFTA_RULE_POSITION to place the rule.
     We can use the existing batch_rule_add().

#2.2 we find a new rule at the position that has no handle, we use the
     batch_rule_add_at() variant that allows us to place the rule at
     the right position. The batch_rule_add_at() variant takes the
     rule we found as parameter, so we can find it in the batch
     through pointer.

> > > * Each time an "old" (i.e., in-kernel) rule is encountered, its handle
> > >   is stored (overwriting any previous ones').
> > >
> > > * New rules following an old one are appended to the old one (by
> > >   specifying its handle) in reverse order, so that each consecutive rule
> > >   is inserted between the old one and previously appended ones.
> > 
> > For existing rules, we already have an absolute reference through the
> > handle, then the NLM_F_APPEND flag tells if we want to place it before
> > or after that rule.
> > 
> > What scenario is forcing us to do this late postprocessing below that
> > we cannot handle with the existing interface? :-)
> 
> Kernel has this:
> 
> -A chain <rule 1> # handle 10
> -A chain <rule 2> # handle 20
> 
> User does:
> 
> -I chain 2 <rule a>
> -I chain 3 <rule b>
> 
> Handling rule a in iptables-nft-restore is easy, just insert before
> handle 20 or append to handle 10. If we don't add this rule to cache,
> rule b will be appended to ruleset (since the rulenum it refers to is
> larger than the number of rules in cache), although it should be
> inserted in between rule a and rule 2.

I agree we need to update the cache, we need this to walk over the
lists to find what rule is placed at what position, there is no other
way. So updating the cache is fine indeed :-).

But I think the postprocessing at nft_action() is not required, it
would be good if we can avoid this.

> So we need to add new rules to cache, otherwise following rule's numbers
> refer to the wrong rule. Now consider a cache of:
> 
> - rule 1 # handle 10
> - rule 2 # new
> - rule 3 # new
> 
> And we parse this command:
> 
> -I chain 3 <rule 4>
> 
> Due to missing handle, we can neither append to rule 2 nor insert before
> rule 3. How do we get rule 4 in between rule 2 and rule 3?

Using batch_rule_add_at() variant, we pass rule 2 as parameter, so we
can look up for it using the rule pointer in the batch.

We may need a batch_rule_insert_at() too.

Would this work for you?

Let me know if I'm missing any scenario, thanks!
Phil Sutter Jan. 11, 2019, 10:42 a.m. UTC | #4
Hi Pablo,

On Thu, Jan 10, 2019 at 11:53:15PM +0100, Pablo Neira Ayuso wrote:
> On Thu, Jan 10, 2019 at 11:17:36AM +0100, Phil Sutter wrote:
> > On Thu, Jan 10, 2019 at 12:50:34AM +0100, Pablo Neira Ayuso wrote:
> > > On Sun, Dec 30, 2018 at 08:06:11PM +0100, Phil Sutter wrote:
[...]
> > > If that is not enough, we could add support for NFTA_RULE_ID to
> > > nf_tables_newrule().  This NFTA_RULE_ID attribute provides a way to
> > > refer to rules coming in this batch. It is similar to NFTA_SET_ID,
> > > however, currently it is only supported from the rule deletion path.
> > > This ID is internal to the transaction - we can freely allocate IDs
> > > for the new rules coming in the batch. My concern is that we may have
> > > problems later on if we do not address limitations in the kernel
> > > interface.
> > 
> > That might help, yes. Though depending on how this will be applied and
> > iptables-restore input, we may still run into ugly issues.
> 
> I'm under the impression that the problem is the lack of handle for
> rules that are not in the kernel, see below.

Yes, this is probably the best choice.

> > > Or thinking if we can do this with the existing interface...  see
> > > below.
> > > 
> > > > To fix this, make use of the rule cache (which holds in-kernel rules
> > > > with their associated handle already). Insert new rules at the right
> > > > position into that cache, then at commit time (i.e., in nft_action())
> > > > traverse each chain's rule list for new rules to add:
> > > > 
> > > > * New rules at beginning of list are inserted in reverse order without
> > > >   reference of another rule, so that each consecutive rule is added
> > > >   before all previous ones.
> > > 
> > > I think this refers to rules like:
> > > 
> > > iptables -I chain 1 ...
> > > 
> > > This case we can just handle it by doing plain rule insertion.
> > 
> > Not quite, a user may simply do:
> > 
> > -I chain 1 <rule 1>
> > -I chain 1 <rule 2>
> > 
> > Legacy code inserts them in order (i.e., rule 2 follows rule 1) while
> > the simple approach you have in mind reverses the ordering.
> 
> I might be misunderstanding anything from above, but legacy is
> reversing the ordering here:

Oh, crap. You're right of course, I wrote the above having my
postprocessing algorithm in mind: When adding the above two rules to the
batch, they will be reversed (so cache contains them in correct order).
During postprocessing, I have to reconstruct the user-provided ordering,
i.e. do the insert without reference in reverse. Of course this is not
required when sticking to the old code which generates the batch jobs
right away.

[...]
> I think we have two scenarios:
> 
> #1 normal iptables-restore: in this case, kernel rules are ignored
>    since we assume an implicit flush. In this case, we need a variant
>    of batch_rule_add_at() that allows us to place the rules at the
>    right position.

Since this is the same as restore with --noflush and an empty ruleset in
kernel, there is no need to treat this separately - the existing code
which creates a flush job and removes any existing rules from cache is
fine.

> #2 --noflush is specified: in this case, two sub-scenarios:
> 
> #2.1 we find an existing rule with a handle at the position, we use the
>      rule handle that we found as NFTA_RULE_POSITION to place the rule.
>      We can use the existing batch_rule_add().
> 
> #2.2 we find a new rule at the position that has no handle, we use the
>      batch_rule_add_at() variant that allows us to place the rule at
>      the right position. The batch_rule_add_at() variant takes the
>      rule we found as parameter, so we can find it in the batch
>      through pointer.

Sounds good, I'll give it a go.

Thanks, Phil
diff mbox series

Patch

diff --git a/iptables/nft.c b/iptables/nft.c
index 1ce1ecdd276be..ab1a361905287 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -644,6 +644,7 @@  static void nft_chain_builtin_add(struct nft_handle *h,
 		return;
 
 	batch_chain_add(h, NFT_COMPAT_CHAIN_ADD, c);
+	nftnl_chain_list_add_tail(c, h->table[table->type].chain_cache);
 }
 
 /* find if built-in table already exists */
@@ -1189,7 +1190,6 @@  nft_rule_append(struct nft_handle *h, const char *chain, const char *table,
 {
 	struct nftnl_chain *c;
 	struct nftnl_rule *r;
-	int type;
 
 	/* If built-in chains don't exist for this table, create them */
 	if (nft_xtables_config_load(h, XTABLES_CONFIG_DEFAULT, 0) < 0)
@@ -1203,21 +1203,21 @@  nft_rule_append(struct nft_handle *h, const char *chain, const char *table,
 
 	if (handle > 0) {
 		nftnl_rule_set(r, NFTNL_RULE_HANDLE, &handle);
-		type = NFT_COMPAT_RULE_REPLACE;
-	} else
-		type = NFT_COMPAT_RULE_APPEND;
-
-	if (batch_rule_add(h, type, r) < 0) {
-		nftnl_rule_free(r);
-		return 0;
+		if (batch_rule_add(h, NFT_COMPAT_RULE_REPLACE, r) < 0) {
+			nftnl_rule_free(r);
+			return 0;
+		}
 	}
 
 	if (verbose)
 		h->ops->print_rule(r, 0, FMT_PRINT_RULE);
 
 	c = nft_chain_find(h, table, chain);
-	if (c)
-		nftnl_chain_rule_add_tail(r, c);
+	if (!c) {
+		errno = ENOENT;
+		return 0;
+	}
+	nftnl_chain_rule_add_tail(r, c);
 
 	return 1;
 }
@@ -2050,37 +2050,11 @@  int nft_rule_delete(struct nft_handle *h, const char *chain,
 	return ret;
 }
 
-static struct nftnl_rule *
-nft_rule_add(struct nft_handle *h, const char *chain,
-	     const char *table, struct iptables_command_state *cs,
-	     uint64_t handle, bool verbose)
-{
-	struct nftnl_rule *r;
-
-	r = nft_rule_new(h, chain, table, cs);
-	if (r == NULL)
-		return NULL;
-
-	if (handle > 0)
-		nftnl_rule_set_u64(r, NFTNL_RULE_POSITION, handle);
-
-	if (batch_rule_add(h, NFT_COMPAT_RULE_INSERT, r) < 0) {
-		nftnl_rule_free(r);
-		return NULL;
-	}
-
-	if (verbose)
-		h->ops->print_rule(r, 0, FMT_PRINT_RULE);
-
-	return r;
-}
-
 int nft_rule_insert(struct nft_handle *h, const char *chain,
 		    const char *table, void *data, int rulenum, bool verbose)
 {
-	struct nftnl_rule *r, *new_rule;
+	struct nftnl_rule *r = NULL, *new_rule;
 	struct nftnl_chain *c;
-	uint64_t handle = 0;
 
 	/* If built-in chains don't exist for this table, create them */
 	if (nft_xtables_config_load(h, XTABLES_CONFIG_DEFAULT, 0) < 0)
@@ -2095,31 +2069,23 @@  int nft_rule_insert(struct nft_handle *h, const char *chain,
 	}
 
 	if (rulenum > 0) {
-		r = nft_rule_find(h, c, data, rulenum);
+		r = nft_rule_find(h, c, data, rulenum - 1);
 		if (r == NULL) {
-			/* special case: iptables allows to insert into
-			 * rule_count + 1 position.
-			 */
-			r = nft_rule_find(h, c, data, rulenum - 1);
-			if (r != NULL)
-				return nft_rule_append(h, chain, table, data,
-						       0, verbose);
-
 			errno = ENOENT;
 			goto err;
 		}
-
-		handle = nftnl_rule_get_u64(r, NFTNL_RULE_HANDLE);
-		DEBUGP("adding after rule handle %"PRIu64"\n", handle);
 	}
 
-	new_rule = nft_rule_add(h, chain, table, data, handle, verbose);
+	new_rule = nft_rule_new(h, chain, table, data);
 	if (!new_rule)
 		goto err;
 
-	if (handle)
+	if (verbose)
+		h->ops->print_rule(new_rule, 0, FMT_PRINT_RULE);
+
+	if (r)	/* insert new rule after r */
 		nftnl_chain_rule_insert_at(new_rule, r);
-	else
+	else	/* insert new rule at beginning of list */
 		nftnl_chain_rule_add(new_rule, c);
 
 	return 1;
@@ -2280,16 +2246,8 @@  int nft_rule_list(struct nft_handle *h, const char *chain, const char *table,
 	bool found = false;
 
 	/* If built-in chains don't exist for this table, create them */
-	if (nft_xtables_config_load(h, XTABLES_CONFIG_DEFAULT, 0) < 0) {
+	if (nft_xtables_config_load(h, XTABLES_CONFIG_DEFAULT, 0) < 0)
 		nft_xt_builtin_init(h, table);
-		/* Force table and chain creation, otherwise first iptables -L
-		 * lists no table/chains.
-		 */
-		if (!list_empty(&h->obj_list)) {
-			nft_commit(h);
-			flush_chain_cache(h, NULL);
-		}
-	}
 
 	ops = nft_family_ops_lookup(h->family);
 
@@ -2395,16 +2353,8 @@  int nft_rule_list_save(struct nft_handle *h, const char *chain,
 	int ret = 0;
 
 	/* If built-in chains don't exist for this table, create them */
-	if (nft_xtables_config_load(h, XTABLES_CONFIG_DEFAULT, 0) < 0) {
+	if (nft_xtables_config_load(h, XTABLES_CONFIG_DEFAULT, 0) < 0)
 		nft_xt_builtin_init(h, table);
-		/* Force table and chain creation, otherwise first iptables -L
-		 * lists no table/chains.
-		 */
-		if (!list_empty(&h->obj_list)) {
-			nft_commit(h);
-			flush_chain_cache(h, NULL);
-		}
-	}
 
 	if (!nft_is_table_compatible(h, table)) {
 		xtables_error(OTHER_PROBLEM, "table `%s' is incompatible, use 'nft' tool.\n", table);
@@ -2523,8 +2473,8 @@  static void batch_obj_del(struct nft_handle *h, struct obj_update *o)
 		break;
 	case NFT_COMPAT_CHAIN_ZERO:
 	case NFT_COMPAT_CHAIN_USER_ADD:
-		break;
 	case NFT_COMPAT_CHAIN_ADD:
+		break;
 	case NFT_COMPAT_CHAIN_USER_DEL:
 	case NFT_COMPAT_CHAIN_USER_FLUSH:
 	case NFT_COMPAT_CHAIN_UPDATE:
@@ -2545,6 +2495,78 @@  static void batch_obj_del(struct nft_handle *h, struct obj_update *o)
 	free(o);
 }
 
+struct rule_stack_data {
+	struct nft_handle *h;
+	uint64_t handle;
+	uint32_t seq;
+};
+
+static int rule_stack_cb(struct nftnl_rule *r, void *data)
+{
+	struct rule_stack_data *rsd = data;
+	uint16_t flags = NLM_F_CREATE;
+
+	if (rsd->handle) {
+		/* append rule to previous in-kernel one */
+		flags |= NLM_F_APPEND;
+		nftnl_rule_set_u64(r, NFTNL_RULE_POSITION, rsd->handle);
+	}
+	nft_compat_rule_batch_add(rsd->h, NFT_MSG_NEWRULE,
+				  flags, rsd->seq++, r);
+	mnl_nft_batch_continue(rsd->h->batch);
+	nftnl_rule_list_del(r);
+	nftnl_rule_free(r);
+	return 0;
+}
+
+struct batch_from_chain_cache_data {
+	struct nft_handle	*handle;
+	uint32_t		seq;
+};
+
+static int batch_from_chain_cache(struct nftnl_chain *c, void *data)
+{
+	struct batch_from_chain_cache_data *d = data;
+	struct rule_stack_data rsd = {
+		.h = d->handle,
+		.seq = d->seq,
+	};
+	struct nftnl_rule_list *stack;
+	struct nftnl_rule_iter *iter;
+	struct nftnl_rule *r;
+
+	stack = nftnl_rule_list_alloc();
+	if (!stack)
+		return -1;
+
+	iter = nftnl_rule_iter_create(c);
+	if (!iter)
+		return -1;
+
+	r = nftnl_rule_iter_next(iter);
+	while (r) {
+		uint64_t handle = nftnl_rule_get_u64(r, NFTNL_RULE_HANDLE);
+
+		if (!handle) {
+			/* new rule to be added, put onto stack */
+			nftnl_rule_list_del(r);
+			nftnl_rule_list_add(r, stack);
+		} else {
+			/* rule in kernel already, handle rules in stack */
+			nftnl_rule_list_foreach(stack, rule_stack_cb, &rsd);
+			d->seq = rsd.seq;
+			rsd.handle = handle;
+		}
+		r = nftnl_rule_iter_next(iter);
+	}
+	nftnl_rule_iter_destroy(iter);
+
+	nftnl_rule_list_foreach(stack, rule_stack_cb, &rsd);
+	nftnl_rule_list_free(stack);
+	d->seq = rsd.seq;
+	return 0;
+}
+
 static int nft_action(struct nft_handle *h, int action)
 {
 	struct obj_update *n, *tmp;
@@ -2628,6 +2650,24 @@  static int nft_action(struct nft_handle *h, int action)
 		mnl_nft_batch_continue(h->batch);
 	}
 
+	if (h->have_cache) {
+		struct batch_from_chain_cache_data d = {
+			.handle = h,
+			.seq = seq,
+		};
+
+		for (i = 0; i < NFT_TABLE_MAX; i++) {
+			const char *cur_table = h->tables[i].name;
+
+			if (!cur_table)
+				continue;
+
+			nftnl_chain_list_foreach(h->table[i].chain_cache,
+						 batch_from_chain_cache, &d);
+		}
+		seq = d.seq;
+	}
+
 	switch (action) {
 	case NFT_COMPAT_COMMIT:
 		mnl_batch_end(h->batch, seq++);
diff --git a/iptables/tests/shell/testcases/ipt-restore/0003-restore-ordering_0 b/iptables/tests/shell/testcases/ipt-restore/0003-restore-ordering_0
new file mode 100755
index 0000000000000..44ee796ef44df
--- /dev/null
+++ b/iptables/tests/shell/testcases/ipt-restore/0003-restore-ordering_0
@@ -0,0 +1,94 @@ 
+#!/bin/bash
+
+# Make sure iptables-restore does the right thing
+# when encountering INSERT rules with index.
+
+set -e
+
+# show rules, drop uninteresting policy settings
+ipt_show() {
+	$XT_MULTI iptables -S | grep -v '^-P'
+}
+
+# basic issue reproducer
+
+$XT_MULTI iptables-restore <<EOF
+*filter
+-A FORWARD -m comment --comment "appended rule" -j ACCEPT
+-I FORWARD 1 -m comment --comment "rule 1" -j ACCEPT
+-I FORWARD 2 -m comment --comment "rule 2" -j ACCEPT
+-I FORWARD 3 -m comment --comment "rule 3" -j ACCEPT
+COMMIT
+EOF
+
+EXPECT='-A FORWARD -m comment --comment "rule 1" -j ACCEPT
+-A FORWARD -m comment --comment "rule 2" -j ACCEPT
+-A FORWARD -m comment --comment "rule 3" -j ACCEPT
+-A FORWARD -m comment --comment "appended rule" -j ACCEPT'
+
+diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
+
+# insert rules into existing ruleset
+
+$XT_MULTI iptables-restore --noflush <<EOF
+*filter
+-I FORWARD 1 -m comment --comment "rule 0.5" -j ACCEPT
+-I FORWARD 3 -m comment --comment "rule 1.5" -j ACCEPT
+-I FORWARD 5 -m comment --comment "rule 2.5" -j ACCEPT
+-I FORWARD 7 -m comment --comment "rule 3.5" -j ACCEPT
+-I FORWARD 9 -m comment --comment "appended rule 2" -j ACCEPT
+COMMIT
+EOF
+
+EXPECT='-A FORWARD -m comment --comment "rule 0.5" -j ACCEPT
+-A FORWARD -m comment --comment "rule 1" -j ACCEPT
+-A FORWARD -m comment --comment "rule 1.5" -j ACCEPT
+-A FORWARD -m comment --comment "rule 2" -j ACCEPT
+-A FORWARD -m comment --comment "rule 2.5" -j ACCEPT
+-A FORWARD -m comment --comment "rule 3" -j ACCEPT
+-A FORWARD -m comment --comment "rule 3.5" -j ACCEPT
+-A FORWARD -m comment --comment "appended rule" -j ACCEPT
+-A FORWARD -m comment --comment "appended rule 2" -j ACCEPT'
+
+diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
+
+# insert rules in between added ones
+
+$XT_MULTI iptables-restore <<EOF
+*filter
+-A FORWARD -m comment --comment "appended rule 1" -j ACCEPT
+-A FORWARD -m comment --comment "appended rule 2" -j ACCEPT
+-A FORWARD -m comment --comment "appended rule 3" -j ACCEPT
+-I FORWARD 1 -m comment --comment "rule 1" -j ACCEPT
+-I FORWARD 3 -m comment --comment "rule 2" -j ACCEPT
+-I FORWARD 5 -m comment --comment "rule 3" -j ACCEPT
+COMMIT
+EOF
+
+EXPECT='-A FORWARD -m comment --comment "rule 1" -j ACCEPT
+-A FORWARD -m comment --comment "appended rule 1" -j ACCEPT
+-A FORWARD -m comment --comment "rule 2" -j ACCEPT
+-A FORWARD -m comment --comment "appended rule 2" -j ACCEPT
+-A FORWARD -m comment --comment "rule 3" -j ACCEPT
+-A FORWARD -m comment --comment "appended rule 3" -j ACCEPT'
+
+diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
+
+# test rule deletion in dump files
+
+$XT_MULTI iptables-restore --noflush <<EOF
+*filter
+-D FORWARD -m comment --comment "appended rule 1" -j ACCEPT
+-D FORWARD 3
+-I FORWARD 3 -m comment --comment "manually replaced rule 2" -j ACCEPT
+COMMIT
+EOF
+
+EXPECT='-A FORWARD -m comment --comment "rule 1" -j ACCEPT
+-A FORWARD -m comment --comment "rule 2" -j ACCEPT
+-A FORWARD -m comment --comment "manually replaced rule 2" -j ACCEPT
+-A FORWARD -m comment --comment "rule 3" -j ACCEPT
+-A FORWARD -m comment --comment "appended rule 3" -j ACCEPT'
+
+diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
+
diff --git a/iptables/tests/shell/testcases/iptables/0005-rule-replace_0 b/iptables/tests/shell/testcases/iptables/0005-rule-replace_0
new file mode 100755
index 0000000000000..5a3e922e50672
--- /dev/null
+++ b/iptables/tests/shell/testcases/iptables/0005-rule-replace_0
@@ -0,0 +1,38 @@ 
+#!/bin/bash
+
+# test rule replacement
+
+set -e
+
+# show rules, drop uninteresting policy settings
+ipt_show() {
+	$XT_MULTI iptables -S | grep -v '^-P'
+}
+
+$XT_MULTI iptables -A FORWARD -m comment --comment "rule 1" -j ACCEPT
+$XT_MULTI iptables -A FORWARD -m comment --comment "rule 2" -j ACCEPT
+$XT_MULTI iptables -A FORWARD -m comment --comment "rule 3" -j ACCEPT
+
+$XT_MULTI iptables -R FORWARD 2 -m comment --comment "replaced 2" -j ACCEPT
+
+EXPECT='-A FORWARD -m comment --comment "rule 1" -j ACCEPT
+-A FORWARD -m comment --comment "replaced 2" -j ACCEPT
+-A FORWARD -m comment --comment "rule 3" -j ACCEPT'
+
+diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
+
+$XT_MULTI iptables -R FORWARD 1 -m comment --comment "replaced 1" -j ACCEPT
+
+EXPECT='-A FORWARD -m comment --comment "replaced 1" -j ACCEPT
+-A FORWARD -m comment --comment "replaced 2" -j ACCEPT
+-A FORWARD -m comment --comment "rule 3" -j ACCEPT'
+
+diff -u -Z <(echo -e "$EXPECT") <(ipt_show)
+
+$XT_MULTI iptables -R FORWARD 3 -m comment --comment "replaced 3" -j ACCEPT
+
+EXPECT='-A FORWARD -m comment --comment "replaced 1" -j ACCEPT
+-A FORWARD -m comment --comment "replaced 2" -j ACCEPT
+-A FORWARD -m comment --comment "replaced 3" -j ACCEPT'
+
+diff -u -Z <(echo -e "$EXPECT") <(ipt_show)