Message ID | 20190721001406.23785-4-fw@strlen.de |
---|---|
State | Changes Requested |
Delegated to: | Pablo Neira |
Headers | show |
Series | fix crash bug during rule restore | expand |
On Sun, Jul 21, 2019 at 02:14:07AM +0200, Florian Westphal wrote: > This makes nft behave like 0.9.0 -- the ruleset > > flush ruleset > table inet filter { > } > table inet filter { > chain test { > counter > } > } > > loads again without generating an error message. > I've added a test case for this, without this it will create an error, > and with a checkout of the 'fixes' tag we get crash. > > Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1351 > Fixes: e5382c0d08e3c ("src: Support intra-transaction rule references") This one is causing the cache corruption, right? > Signed-off-by: Florian Westphal <fw@strlen.de> > --- > src/evaluate.c | 3 +++ > tests/shell/testcases/cache/0003_cache_update_0 | 12 ++++++++++++ > 2 files changed, 15 insertions(+) > > diff --git a/src/evaluate.c b/src/evaluate.c > index b56932ccabcc..8c1c82abed4e 100644 > --- a/src/evaluate.c > +++ b/src/evaluate.c > @@ -3258,6 +3258,9 @@ static int rule_cache_update(struct eval_ctx *ctx, enum cmd_ops op) > struct table *table; > struct chain *chain; > > + if (op == CMD_INVALID) > + return 0; > + > table = table_lookup(&rule->handle, &ctx->nft->cache); > if (!table) > return table_not_found(ctx); > diff --git a/tests/shell/testcases/cache/0003_cache_update_0 b/tests/shell/testcases/cache/0003_cache_update_0 > index 05edc9c7c33e..fb4b0e24c790 100755 > --- a/tests/shell/testcases/cache/0003_cache_update_0 > +++ b/tests/shell/testcases/cache/0003_cache_update_0 > @@ -48,3 +48,15 @@ $NFT -f - >/dev/null <<EOF > add rule ip t4 c meta l4proto igmp accept > add rule ip t4 c index 2 drop > EOF > + > +# Trigger a crash or rule restore error with nft 0.9.1 > +$NFT -f - >/dev/null <<EOF > +flush ruleset > +table inet testfilter { > +} > +table inet testfilter { > + chain test { > + counter > + } > +} > +EOF > -- > 2.21.0 >
Pablo Neira Ayuso <pablo@netfilter.org> wrote: > On Sun, Jul 21, 2019 at 02:14:07AM +0200, Florian Westphal wrote: > > This makes nft behave like 0.9.0 -- the ruleset > > > > flush ruleset > > table inet filter { > > } > > table inet filter { > > chain test { > > counter > > } > > } > > > > loads again without generating an error message. > > I've added a test case for this, without this it will create an error, > > and with a checkout of the 'fixes' tag we get crash. > > > > Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1351 > > Fixes: e5382c0d08e3c ("src: Support intra-transaction rule references") > > This one is causing the cache corruption, right? There is no cache corruption. This patch makes us enter a code path that we did not take before.
On Sun, Jul 21, 2019 at 08:50:40PM +0200, Florian Westphal wrote: > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > On Sun, Jul 21, 2019 at 02:14:07AM +0200, Florian Westphal wrote: > > > This makes nft behave like 0.9.0 -- the ruleset > > > > > > flush ruleset > > > table inet filter { > > > } > > > table inet filter { > > > chain test { > > > counter > > > } > > > } > > > > > > loads again without generating an error message. > > > I've added a test case for this, without this it will create an error, > > > and with a checkout of the 'fixes' tag we get crash. > > > > > > Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1351 > > > Fixes: e5382c0d08e3c ("src: Support intra-transaction rule references") > > > > This one is causing the cache corruption, right? > > There is no cache corruption. This patch makes us enter a code > path that we did not take before. Sorry, I mean, this is a cache bug :-) cache_flush() is cheating, it sets flags to CACHE_FULL, hence this enters this codepath we dit not take before. This propagates from the previous logic, as a workaround. I made this patch, to skip the cache in case "flush ruleset" is requested. This breaks testcases/transactions/0024rule_0, which is a recent test from Phil to check for intra-transaction references, I don't know yet what makes this code unhappy with my changes. Phil, would you help me have a look at what assumption breaks? Thanks.
On Mon, Jul 22, 2019 at 11:25:56PM +0200, Pablo Neira Ayuso wrote: > On Sun, Jul 21, 2019 at 08:50:40PM +0200, Florian Westphal wrote: > > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > On Sun, Jul 21, 2019 at 02:14:07AM +0200, Florian Westphal wrote: > > > > This makes nft behave like 0.9.0 -- the ruleset > > > > > > > > flush ruleset > > > > table inet filter { > > > > } > > > > table inet filter { > > > > chain test { > > > > counter > > > > } > > > > } > > > > > > > > loads again without generating an error message. > > > > I've added a test case for this, without this it will create an error, > > > > and with a checkout of the 'fixes' tag we get crash. > > > > > > > > Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1351 > > > > Fixes: e5382c0d08e3c ("src: Support intra-transaction rule references") > > > > > > This one is causing the cache corruption, right? > > > > There is no cache corruption. This patch makes us enter a code > > path that we did not take before. > > Sorry, I mean, this is a cache bug :-) > > cache_flush() is cheating, it sets flags to CACHE_FULL, hence this > enters this codepath we dit not take before. This propagates from the > previous logic, as a workaround. > > I made this patch, to skip the cache in case "flush ruleset" is > requested. > > This breaks testcases/transactions/0024rule_0, which is a recent test > from Phil to check for intra-transaction references, I don't know yet > what makes this code unhappy with my changes. > > Phil, would you help me have a look at what assumption breaks? Thanks. Sorry, I don't get it. What is happening in the first place? Florian writes, a lookup happens in the wrong table and it seems chain_evaluate() doesn't add the chain to cache. Yet I don't understand why given patch fixes the problem. Cheers, Phil
On Tue, Jul 23, 2019 at 03:11:42PM +0200, Phil Sutter wrote: > On Mon, Jul 22, 2019 at 11:25:56PM +0200, Pablo Neira Ayuso wrote: > > On Sun, Jul 21, 2019 at 08:50:40PM +0200, Florian Westphal wrote: > > > Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > > > On Sun, Jul 21, 2019 at 02:14:07AM +0200, Florian Westphal wrote: > > > > > This makes nft behave like 0.9.0 -- the ruleset > > > > > > > > > > flush ruleset > > > > > table inet filter { > > > > > } > > > > > table inet filter { > > > > > chain test { > > > > > counter > > > > > } > > > > > } > > > > > > > > > > loads again without generating an error message. > > > > > I've added a test case for this, without this it will create an error, > > > > > and with a checkout of the 'fixes' tag we get crash. > > > > > > > > > > Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1351 > > > > > Fixes: e5382c0d08e3c ("src: Support intra-transaction rule references") > > > > > > > > This one is causing the cache corruption, right? > > > > > > There is no cache corruption. This patch makes us enter a code > > > path that we did not take before. > > > > Sorry, I mean, this is a cache bug :-) > > > > cache_flush() is cheating, it sets flags to CACHE_FULL, hence this > > enters this codepath we dit not take before. This propagates from the > > previous logic, as a workaround. > > > > I made this patch, to skip the cache in case "flush ruleset" is > > requested. > > > > This breaks testcases/transactions/0024rule_0, which is a recent test > > from Phil to check for intra-transaction references, I don't know yet > > what makes this code unhappy with my changes. > > > > Phil, would you help me have a look at what assumption breaks? Thanks. > > Sorry, I don't get it. What is happening in the first place? Florian > writes, a lookup happens in the wrong table and it seems > chain_evaluate() doesn't add the chain to cache. Yet I don't understand > why given patch fixes the problem. Just sent a patch for this.
On Sun, Jul 21, 2019 at 02:14:07AM +0200, Florian Westphal wrote: > This makes nft behave like 0.9.0 -- the ruleset > > flush ruleset > table inet filter { > } > table inet filter { > chain test { > counter > } > } > > loads again without generating an error message. > I've added a test case for this, without this it will create an error, > and with a checkout of the 'fixes' tag we get crash. > > Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1351 This one should fix this bugzilla: http://git.netfilter.org/nftables/commit/?id=3ab02db5f836ae0cf9fe7fba616d7eb52139d537 more comments below. [...] > diff --git a/tests/shell/testcases/cache/0003_cache_update_0 b/tests/shell/testcases/cache/0003_cache_update_0 > index 05edc9c7c33e..fb4b0e24c790 100755 > --- a/tests/shell/testcases/cache/0003_cache_update_0 > +++ b/tests/shell/testcases/cache/0003_cache_update_0 > @@ -48,3 +48,15 @@ $NFT -f - >/dev/null <<EOF > add rule ip t4 c meta l4proto igmp accept > add rule ip t4 c index 2 drop > EOF > + > +# Trigger a crash or rule restore error with nft 0.9.1 > +$NFT -f - >/dev/null <<EOF > +flush ruleset > +table inet testfilter { > +} > +table inet testfilter { > + chain test { > + counter > + } > +} > +EOF I have applied this test as an separated patch, thanks Florian.
Pablo Neira Ayuso <pablo@netfilter.org> wrote: > This one should fix this bugzilla: > > http://git.netfilter.org/nftables/commit/?id=3ab02db5f836ae0cf9fe7fba616d7eb52139d537 much better fix, thanks Pablo!
diff --git a/src/evaluate.c b/src/evaluate.c index b56932ccabcc..8c1c82abed4e 100644 --- a/src/evaluate.c +++ b/src/evaluate.c @@ -3258,6 +3258,9 @@ static int rule_cache_update(struct eval_ctx *ctx, enum cmd_ops op) struct table *table; struct chain *chain; + if (op == CMD_INVALID) + return 0; + table = table_lookup(&rule->handle, &ctx->nft->cache); if (!table) return table_not_found(ctx); diff --git a/tests/shell/testcases/cache/0003_cache_update_0 b/tests/shell/testcases/cache/0003_cache_update_0 index 05edc9c7c33e..fb4b0e24c790 100755 --- a/tests/shell/testcases/cache/0003_cache_update_0 +++ b/tests/shell/testcases/cache/0003_cache_update_0 @@ -48,3 +48,15 @@ $NFT -f - >/dev/null <<EOF add rule ip t4 c meta l4proto igmp accept add rule ip t4 c index 2 drop EOF + +# Trigger a crash or rule restore error with nft 0.9.1 +$NFT -f - >/dev/null <<EOF +flush ruleset +table inet testfilter { +} +table inet testfilter { + chain test { + counter + } +} +EOF
This makes nft behave like 0.9.0 -- the ruleset flush ruleset table inet filter { } table inet filter { chain test { counter } } loads again without generating an error message. I've added a test case for this, without this it will create an error, and with a checkout of the 'fixes' tag we get crash. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1351 Fixes: e5382c0d08e3c ("src: Support intra-transaction rule references") Signed-off-by: Florian Westphal <fw@strlen.de> --- src/evaluate.c | 3 +++ tests/shell/testcases/cache/0003_cache_update_0 | 12 ++++++++++++ 2 files changed, 15 insertions(+)