[net] nftables: use list_for_each_entry_safe_reverse to traverse commit_list in nf_tables_abort

Message ID 15bcb964221e1a9498e901417d020609cd5aac65.1449485287.git.lucien.xin@gmail.com
State Accepted
Delegated to: Pablo Neira

Commit Message

Xin Long Dec. 7, 2015, 10:48 a.m. UTC
when we use 'nft -f' to submit rules, it builds multiple rules into
one netlink skb and sends it to the kernel, which processes them one by
one. meanwhile, the kernel adds a trans to commit_list to record every
change. if any of them returns -EAGAIN, status |= NFNL_BATCH_REPLAY is
set, and after the whole batch has been processed, all the commits are
rolled back.

currently the kernel uses list_add_tail to add a trans to commit_list and
list_for_each_entry_safe to roll back, which means the rollback runs in
the same order as the additions. that makes some cases misbehave or
even trigger a call trace, like:

1. add a set into table foo  [return -EAGAIN]:
   commit_list = 'add set trans'
2. del foo:
   commit_list = 'add set trans' -> 'del set trans' -> 'del tab trans'
then nf_tables_abort is called to roll back, and it
first processes 'add set trans':
                   case NFT_MSG_NEWSET:
                        trans->ctx.table->use--;
                        list_del_rcu(&nft_trans_set(trans)->list);

  it will del the set from table foo, but the set was already removed
  when table foo was deleted [step 2], so the kernel will panic.

the right order of rollback is:
  'del tab trans' -> 'del set trans' -> 'add set trans'.
which is the reverse of the commit_list order.

so fix it by rolling back the commits in reverse order in nf_tables_abort.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/netfilter/nf_tables_api.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
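
The ordering problem described in the commit message can be reproduced in a small user-space model. The sketch below is not kernel code: `struct model`, `apply_op`, `undo_op`, `abort_batch` and `demo_abort` are hypothetical names, and the undo rules are a simplified reading of the NFT_MSG_NEWSET case quoted above plus assumed inverse operations for the two delete transactions. It only illustrates that undoing a tail-appended log from the head hits an already unlinked set, while undoing it from the tail does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transaction types, loosely modelled on the commit message. */
enum op { ADD_SET, DEL_SET, DEL_TABLE };

/* Simplified state: one table that may hold one set. */
struct model {
	bool table_linked;	/* table is on the table list */
	bool set_linked;	/* set is on the table's set list */
	int table_use;		/* table use counter */
};

/* Apply one operation, as the batch would while it is being processed. */
static void apply_op(struct model *m, enum op op)
{
	switch (op) {
	case ADD_SET:
		m->set_linked = true;
		m->table_use++;
		break;
	case DEL_SET:
		assert(m->set_linked);
		m->set_linked = false;
		m->table_use--;
		break;
	case DEL_TABLE:
		m->table_linked = false;
		break;
	}
}

/*
 * Undo one operation. Returns false when the undo would unlink a set
 * that is not linked any more (the double unlink from the commit message).
 */
static bool undo_op(struct model *m, enum op op)
{
	switch (op) {
	case ADD_SET:
		if (!m->set_linked)
			return false;
		m->set_linked = false;
		m->table_use--;
		return true;
	case DEL_SET:
		m->set_linked = true;
		m->table_use++;
		return true;
	case DEL_TABLE:
		m->table_linked = true;
		return true;
	}
	return false;
}

/* Roll back the log head-to-tail or tail-to-head; count invalid undo steps. */
int abort_batch(struct model *m, const enum op *log, size_t n, bool reverse)
{
	int bad = 0;

	for (size_t i = 0; i < n; i++) {
		size_t idx = reverse ? n - 1 - i : i;

		if (!undo_op(m, log[idx]))
			bad++;
	}
	return bad;
}

/* Replay the scenario: add a set, then delete the table (which drops the set). */
int demo_abort(bool reverse, struct model *out)
{
	static const enum op log[] = { ADD_SET, DEL_SET, DEL_TABLE };
	const size_t n = sizeof(log) / sizeof(log[0]);
	struct model m = { .table_linked = true, .set_linked = false, .table_use = 0 };

	for (size_t i = 0; i < n; i++)
		apply_op(&m, log[i]);	/* log is appended at the tail */

	int bad = abort_batch(&m, log, n, reverse);

	*out = m;
	return bad;
}
```

Under these assumptions, `demo_abort(false, &m)` reports one invalid undo step and leaves the set linked with `table_use == 1`, while `demo_abort(true, &m)` reports none and restores the initial state.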

Comments

Pablo Neira Ayuso Dec. 9, 2015, 2:03 p.m. UTC | #1
On Mon, Dec 07, 2015 at 06:48:07PM +0800, Xin Long wrote:
[...]
> the right order of rollback should be:
>   'del tab trans' -> 'del set trans' -> 'add set trans'.
> which is opposite with commit_list order.
> 
> so fix it by rolling back commits with reverse order in nf_tables_abort.


You're reporting a kernel panic.

Could you please provide a sequence of commands to reproduce it with
the existing code?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xin Long Dec. 9, 2015, 4:19 p.m. UTC | #2
ok, the reproducer:
1.
#nft delete table foo
#nft add table foo
#nft list tables
#nft list table foo
#nft add chain foo bar
#nft add chain foo baz
#nft add chain foo bok
#nft list table foo

2. #nft -f panic.rules
------panic.rules-------
add rule foo bar ip saddr 127.0.0.1 accept
add rule foo bar ip saddr {192.168.1.2, 192.168.2.3} jump baz
add rule foo bar ip saddr {192.168.1.2, 192.168.2.3} jump bok

add rule foo baz ip saddr {192.168.1.2, 192.168.2.3} jump bok
add rule foo bok ip saddr {192.168.1.2, 192.168.2.3} jump baz

delete table foo
-------end-----------
the panic happens every time (1/1)
Xin Long Dec. 9, 2015, 4:24 p.m. UTC | #3
On Wed, Dec 9, 2015 at 10:03 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Mon, Dec 07, 2015 at 06:48:07PM +0800, Xin Long wrote:
[...]
> You're reporting a kernel panic.
>
> Could you please provide a sequence of commands to reproduce it with
> the existing code?
>
> Thanks.

the reproducer is kind of long, do i need to repost this patch with
the reproducer?
Pablo Neira Ayuso Dec. 13, 2015, 9:47 p.m. UTC | #4
On Thu, Dec 10, 2015 at 12:24:21AM +0800, Xin Long wrote:
> On Wed, Dec 9, 2015 at 10:03 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Mon, Dec 07, 2015 at 06:48:07PM +0800, Xin Long wrote:
[...]
> >> the right order of rollback should be:
> >>   'del tab trans' -> 'del set trans' -> 'add set trans'.
> >> which is opposite with commit_list order.
> >>
> >> so fix it by rolling back commits with reverse order in nf_tables_abort.
> >
> >
> > You're reporting a kernel panic.
> >
> > Could you please provide a sequence of commands to reproduce it with
> > the existing code?
> >
> > Thanks.
> 
> the reproduce is a kind of long, do i need to repost this patch with
> the reproduce?

No need to resend.

Yes, we need this reverse iteration there to handle the 'delete table'
command in the batch. This problem happens since we have
nft_flush_table().

Other callsites artificially restrict deletion of inactive objects,
but that should be removed as we already discussed on the mailing
list.

So I'm applying this, thanks.
Xin Long Feb. 1, 2016, 10:47 a.m. UTC | #5
>
> No need to resend.
>
> Yes, we need this reverse iteration there to handle the 'delete table'
> command in the batch. This problem happens since we have
> nft_flush_table().
>
> Other callsites are artificially restriction deletion of inactive
> objects but that should be removed as we already discuss on the
> mailing list.
>
> So I'm applying this, thanks.

Hi Pablo, has this one been applied ?
Pablo Neira Ayuso Feb. 1, 2016, 11:08 a.m. UTC | #6
On Mon, Feb 01, 2016 at 06:47:33PM +0800, Xin Long wrote:
> >
> > No need to resend.
> >
> > Yes, we need this reverse iteration there to handle the 'delete table'
> > command in the batch. This problem happens since we have
> > nft_flush_table().
> >
> > Other callsites are artificially restriction deletion of inactive
> > objects but that should be removed as we already discuss on the
> > mailing list.
> >
> > So I'm applying this, thanks.
> 
> Hi Pablo, has this one been applied ?

commit a907e36d54e0ff836e55e04531be201bf6b4d8c8
Author: Xin Long <lucien.xin@gmail.com>
Date:   Mon Dec 7 18:48:07 2015 +0800

    netfilter: nf_tables: use reverse traversal commit_list in nf_tables_abort

Patch

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 93cc473..4511a78 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4030,7 +4030,8 @@  static int nf_tables_abort(struct sk_buff *skb)
 	struct nft_trans *trans, *next;
 	struct nft_trans_elem *te;
 
-	list_for_each_entry_safe(trans, next, &net->nft.commit_list, list) {
+	list_for_each_entry_safe_reverse(trans, next, &net->nft.commit_list,
+					 list) {
 		switch (trans->msg_type) {
 		case NFT_MSG_NEWTABLE:
 			if (nft_trans_table_update(trans)) {
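
The patch only swaps the iterator. As a rough illustration of what a safe reverse walk does, the sketch below re-implements a tiny circular doubly linked list in user space. `demo_list_head`, `demo_trans`, `drain_reverse` and the other helpers are stand-ins written for this example and are not the kernel's list.h. The point is that the predecessor is saved before the current entry is unlinked and freed, which is why a 'safe' variant is needed when the loop body may destroy entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in for a circular doubly linked list node. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* Hypothetical transaction record embedding a list node. */
struct demo_trans {
	int msg_type;
	struct demo_list_head list;
};

static void demo_list_init(struct demo_list_head *h)
{
	h->next = h->prev = h;
}

/* Append at the tail, so the list keeps submission order. */
static void demo_list_add_tail(struct demo_list_head *n, struct demo_list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void demo_list_del(struct demo_list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Recover the enclosing demo_trans from its embedded list node. */
#define demo_trans_of(ptr) \
	((struct demo_trans *)((char *)(ptr) - offsetof(struct demo_trans, list)))

/*
 * Walk from tail to head, record each msg_type in visiting order, then
 * unlink and free the entry. Returns the number of entries visited.
 */
size_t drain_reverse(struct demo_list_head *head, int *out, size_t cap)
{
	size_t n = 0;
	struct demo_list_head *pos = head->prev;

	while (pos != head) {
		struct demo_list_head *prev = pos->prev; /* saved before free */
		struct demo_trans *t = demo_trans_of(pos);

		if (n < cap)
			out[n] = t->msg_type;
		n++;
		demo_list_del(pos);
		free(t);
		pos = prev;
	}
	assert(head->next == head && head->prev == head); /* list is empty */
	return n;
}

/* Queue three records with msg_type 1, 2, 3 and drain them in reverse. */
size_t demo_reverse(int *out, size_t cap)
{
	struct demo_list_head head;

	demo_list_init(&head);
	for (int i = 1; i <= 3; i++) {
		struct demo_trans *t = malloc(sizeof(*t));

		if (!t)
			break;
		t->msg_type = i;
		demo_list_add_tail(&t->list, &head);
	}
	return drain_reverse(&head, out, cap);
}
```

With three records queued as 1, 2, 3, `demo_reverse` visits them as 3, 2, 1, which mirrors the 'del tab trans' -> 'del set trans' -> 'add set trans' order the commit message asks for.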