Message ID | 20200203163707.27254-1-fw@strlen.de |
---|---|
Headers | show |
Series | netfilter: conntrack: allow insertion of clashing entries | expand |
On Mon, Feb 03, 2020 at 05:37:03PM +0100, Florian Westphal wrote: > This series allows conntrack to insert a duplicate conntrack entry > if the reply direction doesn't result in a clash with a different > original connection. Applied, thanks for your patience. I introduced the late clash resolution approach to deal with nfqueue, now this is extended to cover more cases, let's give it a try. >Alternatives considered were: >1. Confirm ct entries at allocation time, not in postrouting. > a. will cause uneccesarry work when the skb that creates the > conntrack is dropped by ruleset. > b. in case nat is applied, ct entry would need to be moved in > the table, which requires another spinlock pair to be taken. > c. breaks the 'unconfirmed entry is private to cpu' assumption: > we would need to guard all nfct->ext allocation requests with > ct->lock spinlock. > >2. Make the unconfirmed list a hash table instead of a pcpu list. > Shares drawback c) of the first alternative. The spinlock would need to be grabbed rarely, right? My mean, most extension allocations happen before insertion to the unconfirmed list. Only _ext_add() invocations coming after init_conntrack() might require this. Thanks.
Pablo Neira Ayuso <pablo@netfilter.org> wrote: > On Mon, Feb 03, 2020 at 05:37:03PM +0100, Florian Westphal wrote: > > This series allows conntrack to insert a duplicate conntrack entry > > if the reply direction doesn't result in a clash with a different > > original connection. > > Applied, thanks for your patience. > > I introduced the late clash resolution approach to deal with nfqueue, > now this is extended to cover more cases, let's give it a try. Yes, nfqueue is one way this can happen, changes to resolver libraries to issue parallel requests have exposed this race for non-nfqueue case too. > >Alternatives considered were: > >1. Confirm ct entries at allocation time, not in postrouting. > > a. will cause uneccesarry work when the skb that creates the > > conntrack is dropped by ruleset. > > b. in case nat is applied, ct entry would need to be moved in > > the table, which requires another spinlock pair to be taken. > > c. breaks the 'unconfirmed entry is private to cpu' assumption: > > we would need to guard all nfct->ext allocation requests with > > ct->lock spinlock. > > > >2. Make the unconfirmed list a hash table instead of a pcpu list. > > Shares drawback c) of the first alternative. > > The spinlock would need to be grabbed rarely, right? My mean, most > extension allocations happen before insertion to the unconfirmed list. > Only _ext_add() invocations coming after init_conntrack() might > require this. Right, we could add __nf_ct_ext_add() which is unlocked and convert the additions happening before unconfirmed list insertion there. But there are additional problems that I forgot: a) need for one additional lookup after negative result from main table (this time in unconfirmed list). b) Need to asynchronously re-insert the skb at a later time, once the racing entry is confirmed. We can't use the unconfirmed ct as-is, because it may be incomplete. For instance, the racing skb might not yet have hit the nat table, so the ct contains wrong NAT info. I think b) is a non-starter for all of the alternatives, unfortunately.