Message ID | 20150410201719.GC5968@salvia |
---|---|
State | RFC |
Delegated to: | Pablo Neira |
On 10.04, Pablo Neira Ayuso wrote:
> On Fri, Apr 10, 2015 at 02:36:11PM +0100, Patrick McHardy wrote:
> >
> > I'm wondering if the hook is the right abstraction at all. Netfilter hooks
> > require async resumption (okfn) support, which is why all the refactoring is
> > needed. Is that something that we need for NF_PROTO_NETDEV? For ingress
> > userspace queueing *might* actually work if the missing pieces are added,
> > but for offloaded rules it obviously can not work.
>
> For userspace queueing from ingress we still have to call
> skb_share_check() and hold a reference to orig_dev from the escape
> path. But this support is still missing in nf_tables (actually, we
> only support NFPROTO_IPV4 and NFPROTO_IPV6 at this moment, see patch
> attached). Regarding offload, this path will not see any packet.

We do support all families using the regular NF_QUEUE verdict of course.
But yes, nf_queue.c will simply drop packets that don't have a netfilter
AF registered.

But my question is whether queueing is something that is even worth
considering for the NFPROTO_NETDEV family. As I said, it will at best
work for ingress anyways and that will actually be more tricky than just
calling skb_share_check(), we need to take care of keeping valid
references to all the data you currently store in the CB, including the
packet_type, the device, things attached to the skb at this point to
the stack etc.

If we decide not to support queueing for this family we don't have to
use netfilter hooks for this and all the refactoring for async resume
becomes unnecessary.

> From db2fba74dea98b69ee7615fca86b9847bc42887f Mon Sep 17 00:00:00 2001
> From: Pablo Neira Ayuso <pablo@netfilter.org>
> Date: Fri, 10 Apr 2015 21:40:58 +0200
> Subject: [PATCH] netfilter: nf_tables: restrict nft_queue to AF_INET and
>  AF_INET6
>
> Other families need the corresponding struct nf_afinfo in place to work.
> Restrict it to NFPROTO_IPV4 and NFPROTO_IPV6 until the necessary code is in
> place.
-- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Apr 10, 2015 at 10:33:12PM +0100, Patrick McHardy wrote:
> On 10.04, Pablo Neira Ayuso wrote:
> > On Fri, Apr 10, 2015 at 02:36:11PM +0100, Patrick McHardy wrote:
> > >
> > > I'm wondering if the hook is the right abstraction at all. Netfilter hooks
> > > require async resumption (okfn) support, which is why all the refactoring is
> > > needed. Is that something that we need for NF_PROTO_NETDEV? For ingress
> > > userspace queueing *might* actually work if the missing pieces are added,
> > > but for offloaded rules it obviously can not work.
> >
> > For userspace queueing from ingress we still have to call
> > skb_share_check() and hold a reference to orig_dev from the escape
> > path. But this support is still missing in nf_tables (actually, we
> > only support NFPROTO_IPV4 and NFPROTO_IPV6 at this moment, see patch
> > attached). Regarding offload, this path will not see any packet.
>
> We do support all families using the regular NF_QUEUE verdict of course.
> But yes, nf_queue.c will simply drop packets that don't have a netfilter
> AF registered.
>
> But my question is whether queueing is something that is even worth
> considering for the NFPROTO_NETDEV family. As I said, it will at best
> work for ingress anyways and that will actually be more tricky than just
> calling skb_share_check(), we need to take care of keeping valid
> references to all the data you currently store in the CB, including the
> packet_type, the device, things attached to the skb at this point to
> the stack etc.

I think we only need to hold the reference on orig_dev. The pt_prev
pointer in skb CB can actually be removed. Other things attached to
the skb we already handle this from nf_queue to make sure they don't
vanish.

> If we decide not to support queueing for this family we don't have to
> use netfilter hooks for this and all the refactoring for async resume
> becomes unnecessary.

I think the refactoring is worth it. Have a look at the current state of
this function. It has grown with features over time and it has many
gotos that force you to travel back and forth when reading this code.

Regarding the nf_queue support at ingress, I don't see any major
technical obstacle at this moment to support this and I think that
existing programs that inspect traffic from userspace can benefit from
this feature (e.g. IPS).
On 11.04, Pablo Neira Ayuso wrote:
> On Fri, Apr 10, 2015 at 10:33:12PM +0100, Patrick McHardy wrote:
> > On 10.04, Pablo Neira Ayuso wrote:
> > > On Fri, Apr 10, 2015 at 02:36:11PM +0100, Patrick McHardy wrote:
> > We do support all families using the regular NF_QUEUE verdict of course.
> > But yes, nf_queue.c will simply drop packets that don't have a netfilter
> > AF registered.
> >
> > But my question is whether queueing is something that is even worth
> > considering for the NFPROTO_NETDEV family. As I said, it will at best
> > work for ingress anyways and that will actually be more tricky than just
> > calling skb_share_check(), we need to take care of keeping valid
> > references to all the data you currently store in the CB, including the
> > packet_type, the device, things attached to the skb at this point to
> > the stack etc.
>
> I think we only need to hold the reference on orig_dev. The pt_prev
> pointer in skb CB can actually be removed. Other things attached to
> the skb we already handle this from nf_queue to make sure they don't
> vanish.

Are you sure? What about removable protocols or packet sockets?

> > If we decide not to support queueing for this family we don't have to
> > use netfilter hooks for this and all the refactoring for async resume
> > becomes unnecessary.
>
> I think the refactoring is worth it. Have a look at the current state of
> this function. It has grown with features over time and it has many
> gotos that force you to travel back and forth when reading this code.
>
> Regarding the nf_queue support at ingress, I don't see any major
> technical obstacle at this moment to support this and I think that
> existing programs that inspect traffic from userspace can benefit from
> this feature (e.g. IPS).

Yeah, that might be useful, although they seem to be pretty fine with
getting only IPv4 and IPv6. I guess ARP might be interesting as well,
but we also have hooks for that already.

Regarding the refactoring, there seem to be concerns about performance
impact. My suggestion would be to use nf_hook(), make sure no queueing
can happen and therefore no okfn invocations and then you can simply
add this as a function call to the existing code without the need for
any refactoring or storing state.

You don't lose anything, it only massively simplifies the patches. If
queueing support is added, you can still change it.
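[Editorial sketch, not part of the thread.] The suggestion above (run the hook as a plain function call and rule out queueing, so that no okfn-style continuation is ever needed) can be illustrated with a small userspace model. Everything here is hypothetical: the demo_* names, the verdict enum and the packet struct are stand-ins invented for illustration and are not the kernel's nf_hook() API. The choice to treat a queue verdict as a drop is likewise only an assumption about one way to "make sure no queueing can happen".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical verdicts, loosely modelled on NF_DROP / NF_ACCEPT / NF_QUEUE. */
enum demo_verdict { DEMO_DROP, DEMO_ACCEPT, DEMO_QUEUE };

/* Hypothetical stand-in for an skb; only what this sketch needs. */
struct demo_pkt {
	unsigned int len;
	bool dropped;
};

typedef enum demo_verdict (*demo_hook_fn)(const struct demo_pkt *pkt);

/*
 * Run all hooks synchronously. Returns true if the packet may continue.
 * A QUEUE verdict is treated like DROP (assumption), so the caller never
 * has to be resumed later and no continuation callback is stored.
 */
static bool demo_ingress_filter(struct demo_pkt *pkt,
				const demo_hook_fn *hooks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		switch (hooks[i](pkt)) {
		case DEMO_ACCEPT:
			continue;	/* try the next hook */
		case DEMO_QUEUE:	/* unsupported here: handled as drop */
		case DEMO_DROP:
			pkt->dropped = true;
			return false;
		}
	}
	return true;
}

/* The existing receive path only gains one ordinary function call. */
static int demo_receive(struct demo_pkt *pkt,
			const demo_hook_fn *hooks, size_t n)
{
	if (!demo_ingress_filter(pkt, hooks, n))
		return -1;
	/* ... the rest of the unchanged receive path would run here ... */
	return 0;
}

/* Example hooks used to exercise the sketch. */
static enum demo_verdict demo_accept_all(const struct demo_pkt *pkt)
{
	(void)pkt;
	return DEMO_ACCEPT;
}

static enum demo_verdict demo_drop_runts(const struct demo_pkt *pkt)
{
	return pkt->len < 14 ? DEMO_DROP : DEMO_ACCEPT;
}

static enum demo_verdict demo_try_queue(const struct demo_pkt *pkt)
{
	(void)pkt;
	return DEMO_QUEUE;
}
```

Because the filter always returns a final answer to its caller, the receive path keeps its existing control flow and stores no per-packet state. That is the simplification being argued for, at the cost of not supporting queueing.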
On Sat, Apr 11, 2015 at 02:06:48PM +0100, Patrick McHardy wrote:
> On 11.04, Pablo Neira Ayuso wrote:
> > On Fri, Apr 10, 2015 at 10:33:12PM +0100, Patrick McHardy wrote:
> > > On 10.04, Pablo Neira Ayuso wrote:
> > > > On Fri, Apr 10, 2015 at 02:36:11PM +0100, Patrick McHardy wrote:
> > > We do support all families using the regular NF_QUEUE verdict of course.
> > > But yes, nf_queue.c will simply drop packets that don't have a netfilter
> > > AF registered.
> > >
> > > But my question is whether queueing is something that is even worth
> > > considering for the NFPROTO_NETDEV family. As I said, it will at best
> > > work for ingress anyways and that will actually be more tricky than just
> > > calling skb_share_check(), we need to take care of keeping valid
> > > references to all the data you currently store in the CB, including the
> > > packet_type, the device, things attached to the skb at this point to
> > > the stack etc.
> >
> > I think we only need to hold the reference on orig_dev. The pt_prev
> > pointer in skb CB can actually be removed. Other things attached to
> > the skb we already handle this from nf_queue to make sure they don't
> > vanish.
>
> Are you sure? What about removable protocols or packet sockets?

pt_prev will always be NULL if we enter the netfilter ingress hook, so
no need to store it.

> > > If we decide not to support queueing for this family we don't have to
> > > use netfilter hooks for this and all the refactoring for async resume
> > > becomes unnecessary.
> >
> > I think the refactoring is worth it. Have a look at the current state of
> > this function. It has grown with features over time and it has many
> > gotos that force you to travel back and forth when reading this code.
> >
> > Regarding the nf_queue support at ingress, I don't see any major
> > technical obstacle at this moment to support this and I think that
> > existing programs that inspect traffic from userspace can benefit from
> > this feature (e.g. IPS).
>
> Yeah, that might be useful, although they seem to be pretty fine with
> getting only IPv4 and IPv6. I guess ARP might be interesting as well,
> but we also have hooks for that already.

For security applications, I guess they will be happy to get pretty
much everything that they can inspect.

> Regarding the refactoring, there seem to be concerns about performance
> impact. My suggestion would be to use nf_hook(), make sure no queueing
> can happen and therefore no okfn invocations and then you can simply
> add this as a function call to the existing code without the need for
> any refactoring or storing state.

I'll come back with numbers and more feedback anyway.

> You don't lose anything, it only massively simplifies the patches. If
> queueing support is added, you can still change it.

I'll explore this, this seems like a good alternative if performance
becomes a real issue.

Thanks.
From db2fba74dea98b69ee7615fca86b9847bc42887f Mon Sep 17 00:00:00 2001
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Fri, 10 Apr 2015 21:40:58 +0200
Subject: [PATCH] netfilter: nf_tables: restrict nft_queue to AF_INET and AF_INET6

Other families need the corresponding struct nf_afinfo in place to work.
Restrict it to NFPROTO_IPV4 and NFPROTO_IPV6 until the necessary code is in
place.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_queue.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nft_queue.c b/net/netfilter/nft_queue.c
index e8ae2f6..42ca976 100644
--- a/net/netfilter/nft_queue.c
+++ b/net/netfilter/nft_queue.c
@@ -129,4 +129,5 @@ module_exit(nft_queue_module_exit);
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Eric Leblond <eric@regit.org>");
-MODULE_ALIAS_NFT_EXPR("queue");
+MODULE_ALIAS_NFT_AF_EXPR(AF_INET, "queue");
+MODULE_ALIAS_NFT_AF_EXPR(AF_INET6, "queue");
-- 
1.7.10.4
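
[Editorial sketch, not part of the thread.] The patch swaps one family-independent module alias for two family-specific ones. The userspace model below shows how that could narrow which expression lookups find the module. It assumes that the family-specific alias has the form "nft-expr-<family>-<name>" and that the family numbers are 2 (IPv4), 10 (IPv6) and 5 (netdev). These are assumptions to check against the kernel headers of the tree in question, and all demo_* names are hypothetical. The sketch models alias matching only and says nothing about other ways an expression may be resolved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed family numbers (Linux values); hypothetical names for the demo. */
#define DEMO_NFPROTO_IPV4   2
#define DEMO_NFPROTO_NETDEV 5
#define DEMO_NFPROTO_IPV6   10

/* Build the assumed family-specific alias "nft-expr-<family>-<name>". */
static void demo_af_expr_alias(char *buf, size_t len,
			       unsigned int family, const char *name)
{
	snprintf(buf, len, "nft-expr-%u-%s", family, name);
}

/* Aliases the module would carry after the patch, under the assumption above. */
static const char *const demo_queue_aliases[] = {
	"nft-expr-2-queue",	/* AF_INET */
	"nft-expr-10-queue",	/* AF_INET6 */
};

/* Would a lookup for (family, name) match one of the module's aliases? */
static bool demo_module_matches(unsigned int family, const char *name)
{
	char want[64];
	size_t n = sizeof(demo_queue_aliases) / sizeof(demo_queue_aliases[0]);

	demo_af_expr_alias(want, sizeof(want), family, name);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(want, demo_queue_aliases[i]) == 0)
			return true;
	}
	return false;
}
```

In this model demo_module_matches(DEMO_NFPROTO_IPV4, "queue") and demo_module_matches(DEMO_NFPROTO_IPV6, "queue") succeed, while a lookup for DEMO_NFPROTO_NETDEV finds nothing, which mirrors the restriction named in the patch subject.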